
Compressive imaging for thwarting adversarial attacks on 3D point cloud classifiers

Open Access

Abstract

Three-dimensional (3D) point cloud classifiers are used extensively in safety-critical applications such as autonomous cars, face recognition, military systems, and many more. Despite the critical importance of their reliability, 3D classifiers are prone to adversarial attacks that can be crafted in the real world. While it is possible to use known methods to prevent adversarial attacks, they can easily be counter-attacked, leading to an arms race between the attacker and the defender. Here, we propose to use 3D compressive sensing to recover the original label of the 3D object. Since compressive sensing inherently encodes the 3D signal, it also prevents the arms race between the attacker and the defender. The 3D compressive sensing system we consider is a single-pixel camera (SPC) that can be implemented in Light Detection and Ranging (LiDAR) systems.

© 2021 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Deep learning (DL) algorithms are extensively used for the analysis and processing of captured data such as images. One of the most common uses of DL as a state-of-the-art method is object classification and recognition. These tasks are essential for many applications, such as security, defense, and social networks (e.g., recognizing different users), medical imaging (e.g., recognizing tumors, fractures, and tissues), robotics, and autonomous cars (e.g., recognizing cars, lanes, traffic signs, and pedestrians).

Among the different uses of DL classification, the classification of three-dimensional (3D) objects is of utmost importance in various fields, such as autonomous driving, virtual reality, and military applications.

While DL classification methods have shown state-of-the-art performance, it was discovered that deep neural networks (DNNs) are susceptible to adversarial examples: images designed to mask an object's original label while appearing unchanged to the human eye. The attacker may also fool the classifier into recognizing the object as a particular target; such a process is called a targeted attack. Since the discovery of adversarial attacks, many adversarial attack methods have been demonstrated on different image classifier networks [1]. Adversarial attacks are not restricted to carefully crafted digital data. In fact, numerous methods have been shown to produce adversarial examples in the real world, for example, attacks on road signs [2], printed attacks [3], adversarial patches [4], 3D printed objects [5], glasses [6], and pattern projections [7], amongst others.

Although 3D object classifiers are extensively used in safety-critical applications such as autonomous cars, face recognition for internet payment, military applications, and many more, they too are prone to adversarial attacks. Recently, it was shown that adversarial attacks can be designed and then created in the real world, and that these are able to fool 3D point cloud classifiers [8].

With the discovery of adversarial attacks, many defense methods have been researched [1,9]. One simple defense method is “filtering”: post-acquisition processing that cleans adversarial perturbations from the image before it is presented to the DNN. Another common method is adversarial training, in which the DNN is re-trained to recognize and withstand the attacks. However, all such methods are prone to counter-attacks that take the defense into account, leading to an arms race between the attacker and the defender [10]. In this work, we use optically encoded filtering to stop this arms race.

To understand the arms race between the attacker and the defender, let us consider the following scenario. A developer of a software recognition system integrates a known pre-trained object recognition DNN readily available off the shelf. In turn, an adversary, knowing which DNN was used in the system, examines the pre-trained DNN and designs an adversarial example that is able to fool the original DNN. This kind of attack scenario, in which the adversary has access to the DNN architecture, its parameters, and its training conditions, is referred to as a white-box attack scenario. In this work, we consider white-box attacks, which constitute the most powerful threat model.

The DNN developer, after discovering the existence of the adversarial attack, may in turn apply standard image processing tools to the adversarial image in an attempt to remove the attack. However, if the defense method and its parameters are known to the adversary, he may develop a new attack that is adapted to the defense method. In turn, the DNN developer can apply additional defense tools, which may stimulate the adversary to develop a new counter-attack adapted to the latest defense tools employed. Such cycles may repeat indefinitely, as each side analyzes the other side's operations and overcomes them; thus, an arms race between the defender and the attacker is established. It should be noted that in this arms race, the attacker always has the upper hand. The arms race could be stopped if the DNN developer conceals the defense method and its parameters, for instance, by encrypting the defense process [11]. With the defense method encrypted, the attacker cannot generate further attacks without knowing the specific key used for encoding.

Previously [12], we showed by simulations and experimentally that compressive sensing (CS) has the potential to recover the original label of adversarial 2D images. We demonstrated that by replacing the standard 2D imager with a CS imager, the adversarial examples can be cleaned and attacks can be thwarted. Here, we explore this novel approach of combining compressive sensing and optical encryption as a defense method against adversarial attacks on 3D images. The CS reconstruction process, applied after the 3D image acquisition and before the DNN, serves as a post-processing true label recovery method, while at the same time the physical CS system serves as an optical encoder, preventing the attacker from designing a counter-attack, a.k.a. an adaptive attack. If an adversary wishes to apply an adaptive attack, he/she first needs to decrypt the physical imaging system. The only way to achieve this is by cracking the encryption. However, since the CS encryption has to be cracked in the real world and optical encryption enjoys a huge key space, it is very difficult and time-consuming for the attacker to implement further adversarial attacks.

The main contribution of this paper is to demonstrate that CS can thwart a robust adversarial attack that has been demonstrated in the real world on a 3D point cloud classifier. The CS acquisition process offers inherent encryption of the 3D images and thus can prevent further attack attempts on the network, avoiding the typical arms race between defenders and attackers.

2. Real-world 3D adversarial examples

In this paper, we demonstrate the CS defense against adversarial attacks on DNN classifiers operating on 3D images and on real-world 3D objects. As a case study, we consider a Carlini-Wagner [13] adversarial attack that was recently shown to succeed against physical-world 3D objects [8]. The 3D object classifier considered is PointNet [14].

A standard 3D image acquisition and classification process is illustrated in Fig. 1(a). A real-world object is scanned using a LiDAR imaging system. The samples are then loaded into a computer as point cloud data, which is then classified by a DNN. Following [8], to fool the classification process, the attacker designs an adversarial 3D model of the object on the PC. The 3D model is then printed using a 3D printer and scanned by a LiDAR imager. As before (Fig. 1(a)), the samples are sent to a PC as point cloud data to be classified by the same DNN. This time, the adversarial objects are incorrectly classified due to the adversarial perturbations. Our aim in this paper is to thwart this kind of attack on the DNN block in Fig. 1(b).

Fig. 1. (a) An exemplary scheme of 3D object classification from point cloud data captured by a 3D LiDAR system. (b) An exemplary scheme of the 3D attack process [8]. The adversarial example is designed on the PC and printed with a 3D printer; it is then sampled with a LiDAR system and classified incorrectly by the classifier.

The adversarial examples are designed by searching for a 3D point cloud that is close in shape to the original 3D object while containing small perturbations that drive the DNN to classify the point cloud as another label rather than the original label of the 3D item. The adversarial examples are generated using the Carlini-Wagner method [13]. The attack is formulated as

$$\mathop{\textrm{argmin}}\limits_\delta \; f(x+\delta) + c \cdot \|\delta\|_p, \quad \textrm{with} \;\; f(x) = \max\left\{ \mathop{\max}\limits_{i \neq y'} \{Z(x)_i\} - Z(x)_{y'},\; -\kappa \right\},$$
where δ ∈ ℝn×3 is the adversarial perturbation vector, x ∈ ℝn×3 is the point cloud vector, c is a hyperparameter used to balance the terms in the objective function, and p denotes the Lp norm. The function Z(·) is the output of the logits layer, y′ is the attack target, and κ is the parameter that controls the attack confidence.
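
To make the roles of the symbols concrete, the following minimal Python sketch evaluates the objective of Eq. (1) for a point cloud; the name logits_fn stands in for the classifier's logits layer Z(·) and is an illustrative assumption, not code from the original attack.

    import numpy as np

    def cw_margin(logits, target, kappa=0.0):
        # f(x) of Eq. (1): margin between the best non-target logit and
        # the target logit, clipped at -kappa (the confidence parameter)
        others = np.delete(logits, target)          # Z(x)_i for i != y'
        return max(others.max() - logits[target], -kappa)

    def attack_objective(delta, x, logits_fn, target, c=1.0, p=2):
        # full objective of Eq. (1): f(x + delta) + c * ||delta||_p
        margin = cw_margin(logits_fn(x + delta), target)
        return margin + c * np.linalg.norm(delta.ravel(), ord=p)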

Since the original attack in Eq. (1) tends to produce discontinuous points and high-frequency perturbations, the original label can be easily recovered by applying some kind of pre-filtering that smooths the sampled point cloud before it is presented to the DNN classifier. For example, k-nearest neighbors (k-NN) distance minimization [8] can be applied to the adversarial point cloud data, as illustrated in Fig. 2. After the filtering, the adversarial point cloud becomes more similar to the original object, and the original label is recovered by the DNN.
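
As an illustration, a minimal statistical variant of such k-NN pre-filtering can be sketched as follows; the defaults k=10 and the one-standard-deviation threshold are illustrative choices, not the parameters used in [8].

    import numpy as np

    def knn_prefilter(points, k=10, std_factor=1.0):
        # drop points whose mean distance to their k nearest neighbors is
        # anomalously large, removing high-frequency outlier perturbations
        d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)                  # ignore self-distances
        knn_mean = np.sort(d, axis=1)[:, :k].mean(axis=1)
        keep = knn_mean < knn_mean.mean() + std_factor * knn_mean.std()
        return points[keep]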

Fig. 2. An exemplary scheme of a true label recovery process by applying k-NN based pre-filtering.

Unfortunately, pre-filtering methods, such as the k-NN filter described above, are prone to counter-attacks [10]. To design adversarial examples that are invariant to such pre-filtering, it was proposed in [8] to add a regularization term to Eq. (1) so that the new point cloud is subject to k-NN distance minimization. Additionally, it was proposed to replace the Lp norm with a point cloud distance metric.

The final minimization problem is then defined by

$$\mathop{\textrm{argmin}}\limits_\delta \; f(x') + \alpha \cdot D_C(x', x) + \beta \cdot D_\kappa(x'),$$
where x′ = x + δ, α and β are user-defined parameters that balance the constraints, DC(x′, x) is the Chamfer distance [15], used to measure the distance between the points of the two point clouds, and Dκ(x′) is the k-NN distance. Now, since the adversarial example is designed with the k-NN distance prior, this kind of filtering is no longer able to recover the original label (see Fig. 3).
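
A minimal sketch of the two distance terms in Eq. (2) follows (brute-force O(n²) distances, adequate for small clouds); cw_margin and logits_fn are the illustrative names introduced in the sketch after Eq. (1).

    import numpy as np

    def chamfer_distance(a, b):
        # symmetric Chamfer distance D_C between point clouds a and b [15]
        d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        return d.min(axis=1).mean() + d.min(axis=0).mean()

    def knn_distance(x, k=10):
        # mean distance of each point to its k nearest neighbors
        d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        return np.sort(d, axis=1)[:, :k].mean()

    def adaptive_objective(delta, x, logits_fn, target, alpha=1.0, beta=1.0):
        # Eq. (2): the k-NN term makes the attack survive k-NN pre-filtering
        x_adv = x + delta
        return (cw_margin(logits_fn(x_adv), target)
                + alpha * chamfer_distance(x_adv, x)
                + beta * knn_distance(x_adv))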

Fig. 3. Scheme depicting an example of a second, adaptive adversarial attack meant to fool a defense that uses k-NN based pre-filtering. The new adversarial attack is devised such that the new adversarial example has a small k-NN distance, causing filters that suppress small point variations (e.g., k-NN distance or TV minimization) to fail as a label recovery method.

In fact, the defender can proceed to design new filters or re-train the network to combat the adversarial examples. Subsequently, the attacker can add new terms to defeat the new changes, yielding an arms race of attack and defense between the attacker and the defender. To stop this seemingly infinite cycle, we propose the CS encoding described in the following section.

3. Sensing scheme

3.1 Compressive sensing

Compressive sensing (CS) [16–20] is an efficient sensing technique that, subject to certain technical requirements, enables sub-Nyquist sampling. This means that if the n-by-n image we wish to sample is a signal of length N = n² denoted by x, then we can capture only M < N samples, denoted by g, and still reconstruct the image faithfully. The sensing in this case is mathematically formulated as

$$\textbf{g} = \boldsymbol{\Phi}\textbf{x}$$
where Φ ∈ ℝM×N is the sensing matrix that models the acquisition process.

The sensing matrix performs a dimensionality reduction, mapping the signal into an unstructured space. The resulting sensed data g is encrypted by the sensing matrix, making the data visually unperceivable. Therefore, CS can be considered an efficient encryption method [21].
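
In code, the acquisition of Eq. (3) is a single matrix-vector product; without knowledge of Φ, the measurement vector g reveals nothing visually. A minimal sketch, with a random Gaussian Φ as an illustrative stand-in for the actual sensing matrix:

    import numpy as np

    n = 64                              # illustrative image side; N = n**2
    N, M = n * n, int(0.2 * n * n)      # M < N compressive samples
    rng = np.random.default_rng(0)
    phi = rng.standard_normal((M, N))   # stand-in sensing matrix (the key)
    x = rng.random(N)                   # vectorized n-by-n image
    g = phi @ x                         # compressed, encrypted samples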

After the sampling process, the compressive samples can be reconstructed by an iterative process that exploits prior knowledge about the original signal, for example, that the signal is sparse under some transformation or has a small amount of spatial variation [17,19,20]. For LiDAR 3D imaging, we propose to implement the well-researched single-pixel camera (SPC) imager [20,22,23]. An SPC-based LiDAR scheme is illustrated in Fig. 4 [23]. In the system, a short laser pulse illuminates the subject. The returned light is then projected by a lens onto a spatial multiplexing device such as a digital micromirror device (DMD). The DMD encodes the incoming light according to a predefined sensing matrix Φ, and the reflected, encoded light is collected by a convex lens onto a photon-counting time detector.

Fig. 4. Illustration of the experimental setup.

After the sampling process, the compressed and encrypted M depth sequences are sent to a PC for reconstruction. As the sensing matrix Φ, we consider here the well-known Hadamard sensing matrix [24], denoted by H. The Hadamard matrix has many advantages for SPC CS applications. First, it can be constructed recursively and has a fast transform, similar to the fast Fourier transform [24]. Second, the Hadamard matrix can be sampled with variable density [25–27], which significantly improves the CS reconstruction quality. Finally, since the Hadamard matrix is bipolar, it is easy to implement in the SPC system.
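
For illustration, the recursive (Sylvester) construction and the fast transform can be sketched as follows; both are standard constructions [24], not code from this paper.

    import numpy as np

    def hadamard(n):
        # Sylvester's recursion; n must be a power of two, entries are +/-1
        if n == 1:
            return np.array([[1]])
        h = hadamard(n // 2)
        return np.block([[h, h], [h, -h]])

    def fwht(x):
        # fast Walsh-Hadamard transform, O(N log N), analogous to the FFT;
        # returns hadamard(len(x)) @ x without forming the matrix
        x = np.asarray(x, dtype=float).copy()
        h = 1
        while h < len(x):
            for i in range(0, len(x), 2 * h):
                a, b = x[i:i + h].copy(), x[i + h:i + 2 * h].copy()
                x[i:i + h], x[i + h:i + 2 * h] = a + b, a - b
            h *= 2
        return x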

Our simulations used the bipolar −1 and +1 values of the Hadamard matrix. The bipolar matrix multiplications can be realized on the unipolar DMD using four common options: adding a second detector and measuring the positive and negative values separately [28,29], using a single balanced detector [30], separating the patterns into positive and negative patterns and thus doubling the number of measurements [28], or even without using additional samples or detectors [31].
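
As a sketch of the third option, each bipolar row is split into its positive and negative binary parts, which a 0/1 DMD can display directly; differencing the two detector readings recovers the bipolar measurement at the cost of doubling the pattern count.

    import numpy as np

    def bipolar_on_unipolar_dmd(x, h_rows):
        # h_rows: (M, N) bipolar +/-1 patterns; the DMD shows 0/1 masks
        p_pos = (h_rows > 0).astype(float)   # mirrors 'on' where entry is +1
        p_neg = (h_rows < 0).astype(float)   # mirrors 'on' where entry is -1
        return p_pos @ x - p_neg @ x         # equals h_rows @ x, using 2M patterns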

During the sampling, we pseudo-randomly choose M of the N rows of the Hadamard matrix according to variable-density sampling [25]. Notice that this leads to $\binom{N}{M} = \frac{N!}{M!(N-M)!}$ possible sampling matrices, and therefore the key space of the encryption becomes extremely large. Furthermore, it is possible to scramble the columns of the Hadamard sensing matrix [32], increasing the key space by a further factor of N!.
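
For a feel of the numbers, the key space can be counted directly in Python; the pattern size below is an illustrative choice, not the dimensions used in our experiments.

    from math import comb, factorial

    N = 64 * 64                 # e.g., 64 x 64 patterns, N = 4096 (illustrative)
    M = int(0.2 * N)            # compression ratio M/N = 0.2 (see Section 4)
    row_keys = comb(N, M)                 # binomial(N, M) row selections
    total_keys = row_keys * factorial(N)  # times N! column scramblings [32]
    print(len(str(row_keys)) - 1)         # decimal order of magnitude
    print(len(str(total_keys)) - 1)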

3.2 Optical compressive defense against adversarial attacks

In [12], we showed that CS can be used to recover the original label of adversarial 2D images. Here, we extend this concept to 3D LiDAR imaging. The compressive reconstruction of the encoded data cleans up the adversarial perturbation and helps to recover the original label of the 3D adversarial example (see Fig. 5). Furthermore, the CS acquisition process works as an optical encryption mechanism [21,33], which sets a major hurdle against any counter-attack attempts.

Fig. 5. An exemplary scheme of a true label recovery process after the adversarial attack. In the example, we propose to use a compressive SPC LiDAR system with encrypted sampling patterns. The CS sensing matrix works as an encryption key. The encrypted compressive samples are then reconstructed using an iterative 3D TV minimization process.

CS imagers have an inherent encrypting property [33]: the data is encrypted and compressed with an encryption key determined by the particular sensing matrix. This encryption is performed entirely in hardware during the acquisition step. The key, which determines the particular sensing matrix used in the acquisition step, needs to be shared (through a secure distribution channel) with the CS reconstruction algorithm (Fig. 5).
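
In practice, only a short seed has to travel over the secure channel; both the camera and the reconstruction side can expand it into the same pseudo-random row selection. The sketch below uses uniform random selection for brevity, whereas the actual system uses variable-density sampling [25].

    import numpy as np

    def rows_from_key(key, n_rows, m):
        # expand a shared secret key into the pseudo-random choice of M rows;
        # camera and reconstruction algorithm call this with the same key
        rng = np.random.default_rng(key)
        return rng.choice(n_rows, size=m, replace=False)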

3.3 Robustness to adaptive attacks

Since the sensing matrix encrypts the compressed data, the attacker cannot implement the CS process during the adversarial example design, making further attacks on the DNN impossible unless the compressed data is decrypted (see Fig. 6).

Fig. 6. Contrary to the k-NN based pre-filtering defense against adversarial attacks, the encrypted CS approach cannot simply be simulated by the adversarial attack process without knowing the encryption key. Thus, further attacks cannot be devised.

The encrypted CS together with the reconstruction process can be viewed as a pre-filtering process that attempts to filter out the attacking components of the adversarial example. While pre-filtering is a well-known approach, it is also known to be vulnerable to adaptive attacks [10]. This kind of adaptive attack assumes that the attacker has a significant amount of information about the pre-filtering model. However, since the encrypted CS is concealed from the adversary by encryption, the attacker cannot generate attacks that employ the pre-filtering model.

Since the complete encrypted CS model is unavailable to the attacker, the adversary may try a “black box” approach to generate an adaptive attack. One such technique is to estimate the model by querying the overall system. While such a technique may be feasible with software-implemented pre-filtering, it is not with the proposed hardware encryption. Since the SPC performs the CS encryption in the optical domain, querying the complete system is implausible: typical attacks require between $\mathcal{O}(10^4)$ and $\mathcal{O}(10^6)$ queries [34], and generating that many optical targets x, placing them in front of the system, and measuring the outputs would take a prohibitively long time.
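
A back-of-envelope estimate makes the point; the per-query time below is our rough assumption for preparing and placing a physical target and capturing it, not a measured figure.

    queries = 10**5             # mid-range of the 1e4 to 1e6 queries cited in [34]
    t_per_query_s = 10 * 60     # assumed ~10 minutes per physical target
    years = queries * t_per_query_s / (3600 * 24 * 365)
    print(f"~{years:.1f} years of continuous querying")   # ~1.9 years

Under the same assumption, the upper-range 10^6 queries would require roughly 19 years of continuous measurements.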

In a severe scenario, the attacker may gain access to the output of the encrypting SPC, that is, to the compressed image g. In such a scenario, he/she might attempt to crack the encryption to retrieve the key. However, due to the huge key space of our encryption, a brute-force attack is out of the question. For example, since we choose M pseudo-random patterns out of the N possible Hadamard matrix rows, the key space is $\binom{N}{M}$. Moreover, to check whether a guessed sensing matrix is even close to the real one, the attacker would need to run a slow iterative reconstruction process that can take anywhere from a second to tens of minutes, depending on the data size.

In a scenario more favorable to the adversary, he might gain access to the type of the sensing model (Hadamard in our case) and to its size (M × N). In such a case, he can apply a known-plaintext attack [35], for which only N interrogations of the encryption system are sufficient. In this case, to avoid having the encryption cracked, the key can be rotated every M × Tquery seconds, where Tquery is the total time needed to craft the desired target, place it in front of the camera, and capture the respective compressed data g.

4. Experimental demonstration

We simulated the sensing process of adversarial 3D CAD models following the process in Fig. 3. To check how the CS SPC process affects the 3D adversarial attack, we simulated the 3D LiDAR sampling with the compressive SPC, followed by the PointNet classifier [14], as depicted in Fig. 5. The CS simulations were run on an i7 CPU with 32 GB of RAM. Reconstruction was performed by total variation (TV) minimization with the NESTA solver. For this research, we used the test set of the ModelNet40 dataset [36], which contains CAD models of 3D objects from 40 categories.
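
NESTA is a published solver; as a schematic stand-in only, the sketch below recovers a 2D image by plain gradient descent on a smoothed TV-regularized least-squares objective (the actual reconstruction operates on 3D LiDAR data and uses NESTA).

    import numpy as np

    def tv_grad(x, eps=1e-3):
        # gradient of the smoothed (Charbonnier) total variation of image x
        dx, dy = np.zeros_like(x), np.zeros_like(x)
        dx[:, :-1] = x[:, 1:] - x[:, :-1]
        dy[:-1, :] = x[1:, :] - x[:-1, :]
        mag = np.sqrt(dx**2 + dy**2 + eps**2)
        px, py = dx / mag, dy / mag
        g = np.zeros_like(x)
        g[:, 1:] += px[:, :-1]; g[:, :-1] -= px[:, :-1]   # adjoint of horiz. diff
        g[1:, :] += py[:-1, :]; g[:-1, :] -= py[:-1, :]   # adjoint of vert. diff
        return g

    def tv_reconstruct(g_meas, phi, shape, lam=0.1, step=1e-3, iters=500):
        # minimize 0.5*||phi x - g||^2 + lam*TV(x) by gradient descent
        x = np.zeros(shape)
        for _ in range(iters):
            resid = phi @ x.ravel() - g_meas
            grad = (phi.T @ resid).reshape(shape) + lam * tv_grad(x)
            x -= step * grad
        return x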

In Fig. 7, we show how the 3D point clouds were visually modified by a targeted adversarial attack designed to fool the classifier into classifying the 3D CAD models as cone. For this example, we chose six exemplary 3D CAD models: bottle, bowl, tent, bed, person, and vase. After the attack, we simulated 3D CS. The reconstruction results of the compressed samples are also included in Fig. 7. Notice that the adversarial models differ considerably from the originals in most cases. While the CS process was not able to completely remove those differences, after the CS process the classifier successfully recognized the original labels with high confidence (higher than 0.9 for all six cases).

Fig. 7. A visual comparison between the original 3D point cloud model, the attacked point cloud, and the attacked point cloud after CS. The original 3D CAD models are: bottle, bowl, tent, bed, person, and vase. After the targeted attack, all models were classified as cone. After the CS process, the original labels were recovered successfully with high confidence (over 0.9 for all the 3D CAD models).

To test our method on different 3D object models, we chose 18 arbitrary 3D object classes and 30 random 3D CAD models per class. To each 3D CAD model, we applied a targeted adversarial attack [13], causing the models to be classified as cone. After each successful attack, we simulated the CS process (Fig. 5) in order to recover the original label.

We evaluated the label recovery performance by calculating the ratio between the number of models that were successfully classified by the PointNet classifier and the total number of successfully attacked 3D object models; we denote this the true recovery rate. In addition, we tested the ability of CS to dodge the targeted attack by calculating the ratio of models that were classified as any class other than the targeted class; we denote this the target dodge rate.
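
Both rates reduce to simple counting over the successfully attacked models; a minimal sketch (argument names are illustrative):

    def recovery_metrics(preds_after_cs, true_labels, target_label):
        # preds_after_cs: labels predicted after the CS defense, one per
        # successfully attacked model; returns (true recovery rate,
        # target dodge rate)
        n = len(preds_after_cs)
        recovered = sum(p == t for p, t in zip(preds_after_cs, true_labels))
        dodged = sum(p != target_label for p in preds_after_cs)
        return recovered / n, dodged / n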

In Table 1, we can see that the true recovery rate varies between labels. For this experiment, we chose a compression ratio M/N = 0.2, which yielded slightly better results than other values. However, in this application, we did not notice a significant effect on recovery rates when changing the compression ratio within the range of 0.05 to 0.7. As a rule of thumb, a compression ratio that is too low (less than 0.05) leads to poor CS reconstruction quality, making it harder for the PointNet++ classifier to recognize the correct label. On the other hand, a compression ratio that is too high (more than 0.7) leads to a faithful reconstruction of both the object and the perturbations, again causing misclassification. Therefore, the compression ratio should be kept within reasonable bounds. Table 1 shows that some categories of 3D CAD models were correctly classified every time, while others could not be recovered at all. However, while 3 categories of the 3D CAD models were not recovered, they were successfully subverted from the targeted cone label, as can be seen in the target dodge rate column. In fact, while only 52% of the 3D CAD models were successfully recovered, 89% of them successfully dodged the targeted attack.

Table 1. True label recovery rate and targeted attack dodge rate for 18 different categories of 3D CAD models.

One can notice in Table 1 that the recovery rate depends significantly on the object's class. This is a direct consequence of applying a targeted attack, since the 3D targeted adversarial attack requires a different amount of perturbation for different objects to succeed. For example, only a small perturbation is required to fool the PointNet++ into misclassifying vase as cone, since the two are structurally similar; consequently, it is much easier for the CS method to filter out those perturbations. On the other hand, it is much harder to make the classifier misclassify the door object as cone, and therefore a larger perturbation is required. This leads to an object that looks significantly different from the original, so even after filtering by the CS method, the original label is not recovered. Another issue to keep in mind is that when a large perturbation is required to fool the classifier, the object sometimes becomes so distorted that its distinct features are unrecognizable. Nevertheless, notice in the target dodge rate column of Table 1 that the adversarial perturbations were still filtered out by the CS method, since the targeted label is not recognized with high probability for all labels, including those that were not correctly recovered.

5. Conclusion

In this paper, we introduced an application of a compressive sensing single-pixel camera for thwarting adversarial attacks on 3D point cloud data, with broad applications including LiDAR-based systems. We used CAD models to demonstrate the effectiveness of the proposed approach. We showed that 89% of the time, compressive sensing was able to dodge the targeted attack, preventing the attacker from choosing a specific target for the attack. This can be especially important in defending against digital identity theft, preventing one person from posing as another. We also showed that the compressive single-pixel camera can be used to encode the 3D image in order to stop the arms race between the attacker and the defender.

Funding

Office of Naval Research (N00014-20-1-2690); Air Force Office of Scientific Research (FA9550-18-1-0338, FA9550-21-1-0333).

Acknowledgements

B. Javidi wishes to acknowledge support from The Air Force Office of Scientific Research under (FA9550-18-1-0338), (FA9550-21-1-0333) and Office of Naval Research (N00014-20-1-2690).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. K. Ren, T. Zheng, Z. Qin, and X. Liu, “Adversarial attacks and defenses in deep learning,” Engineering 6(3), 346–360 (2020). [CrossRef]  

2. K. Eykholt, I. Evtimov, E. Fernandes, B. Li, A. Rahmati, C. Xiao, A. Prakash, T. Kohno, and D. Song, "Robust physical-world attacks on deep learning visual classification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018).

3. A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial examples in the physical world,” arXiv preprint arXiv:1607.02533 (2016).

4. T. B. Brown, D. Mané, A. Roy, M. Abadi, and J. Gilmer, “Adversarial patch,” arXiv preprint arXiv:1712.09665 (2017).

5. A. Athalye, L. Engstrom, A. Ilyas, and K. Kwok, "Synthesizing robust adversarial examples," in International Conference on Machine Learning (PMLR, 2018).

6. M. Sharif, S. Bhagavatula, L. Bauer, and M. K. Reiter, “A general framework for adversarial examples with objectives,” ACM Trans. Priv. Secur. 22(3), 1–30 (2019). [CrossRef]  

7. B. Nassi, D. Nassi, R. Ben-Netanel, Y. Mirsky, O. Drokin, and Y. Elovici, “Phantom of the ADAS: Phantom Attacks on Driver-Assistance Systems,” IACR Cryptol.ePrint Arch. 2020, 85 (2020).

8. T. Tsai, K. Yang, T. Ho, and Y. Jin, "Robust adversarial objects against deep learning models," in Proceedings of the AAAI Conference on Artificial Intelligence (2020).

9. N. Akhtar and A. Mian, “Threat of adversarial attacks on deep learning in computer vision: A survey,” IEEE Access 6, 14410–14430 (2018). [CrossRef]  

10. B. Biggio and F. Roli, “Wild patterns: Ten years after the rise of adversarial machine learning,” Pattern Recognit. 84, 317–331 (2018). [CrossRef]  

11. O. Taran, S. Rezaeifar, T. Holotyak, and S. Voloshynovskiy, “Machine learning through cryptographic glasses: combating adversarial attacks by key-based diversified aggregation,” EURASIP J. on Info. Security 2020(1), 10–18 (2020). [CrossRef]  

12. V. Kravets, B. Javidi, and A. Stern, “Compressive imaging for defending deep neural networks from adversarial attacks,” Opt. Lett. 46(8), 1951–1954 (2021). [CrossRef]  

13. N. Carlini and D. Wagner, "Towards evaluating the robustness of neural networks," in 2017 IEEE Symposium on Security and Privacy (SP) (IEEE, 2017).

14. C. R. Qi, L. Yi, H. Su, and L. J. Guibas, "PointNet++: Deep hierarchical feature learning on point sets in a metric space," arXiv preprint arXiv:1706.02413 (2017).

15. H. Fan, H. Su, and L. J. Guibas, “A point set generation network for 3d object reconstruction from a single image,” in Proceedings of the IEEE conference on computer vision and pattern recognition, (2017).

16. R. Baraniuk, “Compressive Sensing [Lecture Notes],” IEEE Signal Process. Mag. 24(4), 118–121 (2007). [CrossRef]  

17. D. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory 52(4), 1289–1306 (2006). [CrossRef]  

18. S. Foucart and H. Rauhut, A mathematical introduction to compressive sensing (Birkhäuser, 2013).

19. Y. C. Eldar and G. Kutyniok, Compressed Sensing: Theory and Applications (Cambridge University, 2013).

20. A. Stern, Optical compressive imaging (CRC, Taylor & Francis, 2017).

21. Y. Zhang, L. Y. Zhang, J. Zhou, L. Liu, F. Chen, and X. He, “A review of compressive sensing in information security field,” IEEE Access 4, 2507–2519 (2016). [CrossRef]  

22. M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, and R. G. Baraniuk, “Single-pixel imaging via compressive sampling,” IEEE Signal Process. Mag. 25(2), 83–91 (2008). [CrossRef]  

23. D. Takhar, J. N. Laska, M. B. Wakin, M. F. Duarte, D. Baron, S. Sarvotham, K. F. Kelly, and R. G. Baraniuk, "A new compressive imaging camera architecture using optical-domain compression,” in Computational Imaging IV (International Society for Optics and Photonics, 2006).

24. S. S. Agaian, H. G. Sarukhanyan, K. O. Egiazarian, and J. Astola, Hadamard transforms (SPIE, 2011).

25. Z. Wang and G. R. Arce, “Variable Density Compressed Image Sampling,” IEEE Trans. on Image Process. 19(1), 264–270 (2010). [CrossRef]  

26. V. Kravets and A. Stern, "3D Compressive LIDAR Imaging Using Multiscale-Ordered Hadamard Basis," in 3D Image Acquisition and Display: Technology, Perception and Applications (Optical Society of America, 2018).

27. A. Stern, V. Kravets, Y. Rivenson, and B. Javidi, "Compressive sensing with variable density sampling for 3D imaging," in Three-Dimensional Imaging, Visualization, and Display 2019 (International Society for Optics and Photonics, 2019).

28. J. W. Goodman, Introduction to Fourier Optics (Roberts & Co. Publishers, 2005).

29. B. Lochocki, A. Gambín-Regadera, and P. Artal, “Performance evaluation of a two detector camera for real-time video,” Appl. Opt. 55(36), 10198–10203 (2016). [CrossRef]  

30. F. Soldevila, P. Clemente, E. Tajahuerce, N. Uribe-Patarroyo, P. Andrés, and J. Lancis, “Computational imaging with a balanced detector,” Sci. Rep. 6(1), 29181 (2016). [CrossRef]  

31. D. Wu, J. Luo, G. Huang, Y. Feng, X. Feng, R. Zhang, Y. Shen, and Z. Li, “Imaging biological tissue with high-throughput single-pixel compressive holography,” Nat. Commun. 12(1), 1–12 (2021). [CrossRef]  

32. L. Gan, T. T. Do, and T. D. Tran, "Fast compressive imaging using scrambled block Hadamard ensemble," in 16th European Signal Processing Conference (IEEE, 2008).

33. B. Javidi, A. Carnicer, M. Yamaguchi, T. Nomura, E. Pérez-Cabré, M. S. Millán, N. K. Nishchal, R. Torroba, J. F. Barrera, and W. He, “Roadmap on optical security,” J. Opt. 18(8), 083001 (2016). [CrossRef]  

34. A. Ilyas, L. Engstrom, A. Athalye, and J. Lin, "Black-box adversarial attacks with limited queries and information,” in International Conference on Machine Learning (PMLR, 2018).

35. Y. Frauel, A. Castro, T. J. Naughton, and B. Javidi, “Resistance of the double random phase encryption against various attacks,” Opt. Express 15(16), 10253–10265 (2007). [CrossRef]  

36. Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao, "3D ShapeNets: A deep representation for volumetric shapes," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015).
