Applied Science and Convergence Technology 2023; 32(6): 141-150
Published online November 30, 2023
https://doi.org/10.5757/ASCT.2023.32.6.141
Copyright © The Korean Vacuum Society.
Seunghwan Moon^{a,b}, Jihun Kang^{a,b}, and Jong-Souk Yeo^{a,*}
^{a}School of Integrated Technology, College of Computing, Yonsei University, Seoul 03722, Republic of Korea
^{b}BK21 Graduate Program in Intelligent Semiconductor Technology, Yonsei University, Seoul 03722, Republic of Korea
Correspondence to: jongsoukyeo@yonsei.ac.kr
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc-nd/4.0/) which permits non-commercial use, distribution and reproduction in any medium without alteration, provided that the original work is properly cited.
Evolving nanotechnologies and a deeper understanding of nanophotonics have recently enabled the control of electromagnetic waves using metasurfaces. Since metasurfaces can provide diverse optical characteristics depending on their geometries, the forward design of metasurfaces has conventionally been conducted through an understanding of the physical effects of each geometrical parameter. In contrast, the inverse design approach optimizes the metasurface geometry using computational algorithms. This review discusses recent studies on constructing generative models for the inverse design of nanophotonic metasurfaces. The generative model for inverse design is constructed mainly from three components: an evaluator, a generator, and a criterion. The evaluator, which can be implemented by physical simulators or deep neural networks, determines whether the input metasurface geometry satisfies the target optical characteristics. The generator suggests new possible design candidates that may have optical properties close to the target. The criterion, which includes algorithms based on mathematical optimization and artificial intelligence, manages the operation of the generative model while ensuring convergence to optimal solutions. Inverse design takes advantage of a larger design space for customized applications along with the possibility of investigating new physics, and hence it is expected to further improve metasurfaces with emerging computational algorithms.
Keywords: Metasurface, Nanophotonics, Inverse design, Generative model, Mathematical optimization, Artificial intelligence
From ancient civilizations to modern society, humans have attempted to control light and have also sought to understand light-matter interactions. The earliest lenses, identified in ancient Egyptian and Mesopotamian civilizations, were made of polished rock crystal or glass-like materials and were used for magnification and focusing [1–3]. Around the 3rd century before the common era (BCE), the Greek mathematician Euclid wrote extensively on the properties of light and suggested the rules of reflection [4,5]. In the second century of the common era (CE), another Greek scientist, Ptolemy, examined the behavior of light as it traveled through various media and produced important observations on refraction [5–7]. Substantial advances in optics were made in the period from the 8th to the 14th century CE. Scholars such as Al-Kindi, Alhazen, and others conducted extensive research and created ideas on how light behaves [5]. Alhazen is also referred to as the father of modern optics because he provided theories about light, colors, vision, and the concepts of reflection and refraction in his ‘Book of Optics’ [5,8]. In the late 17th century, Isaac Newton found that white light can be separated into rainbow colors when it passes through a prism [9,10]. By observing the continuous spectrum coming out of a prism, he concluded that white light is composed of diverse colors, and this led Newton to invent his reflecting telescope, known as the Newtonian telescope [10]. A curved primary mirror was utilized to gather and focus light, reducing chromatic aberration and producing sharper images [10,11]. Theoretical research on light was undertaken from the 17th to the 19th centuries. Scientists such as Christiaan Huygens and Robert Hooke explained light as waves propagating through matter, a description called wave theory [12].
Thomas Young, an English scientist in the early 19th century, performed the double-slit experiment and demonstrated the interference and diffraction of light, both wave characteristics [13,14]. The research of Max Planck and Albert Einstein around the turn of the 20th century, however, led to the understanding of light as particles called photons, as described by particle theory [15–18]. These observations prompted scientists to infer that light shows both wave and particle properties depending on the experimental settings.
With this theoretical understanding of light, the concept of optical antennas that manipulate light at the nanoscale was presented by researchers such as Edward Hutchinson Synge, John Wessel, and Lukas Novotny [19–22]. Optical antennas can be fabricated using various materials and structures, such as nanostructures, optical fibers, and photonic crystals, and these antennas enable optical signal concentration, directional control, and wavelength adjustment [21,23–25]. A metamaterial is a subwavelength structure that can modulate a light signal, and thus its ratio of lattice constant to operating wavelength is considerably smaller than that of photonic crystals [26–28]. Furthermore, a metamaterial is an array of meta-atoms, and this repeating structure gives rise to specialized optical characteristics according to the geometrical parameters [29,30]. A metasurface is a two-dimensional metamaterial that can be used for diverse applications such as optical communication, nanoscale light sources, and optical/plasmonic biosensors [29,31,32]. By changing the geometry of metasurfaces, it is possible to control the properties of reflected or transmitted light, such as frequency, complex wavenumber, and directionality.
In the conventional approach to designing metasurfaces, the effects of the geometrical parameters of the metasurface on its optical properties should be investigated based on a physical understanding. However, there may be physics in nature that is not yet understood but that could provide an analytical solution for metasurface design. In addition, when multiple functionalities are required, various physical phenomena become entangled and the specifications can be in a trade-off relation with each other, resulting in high complexity in finding an optimal solution through rather tedious and laborious experimental verification. In response, inverse design approaches that provide optimized solutions through computational algorithms have been suggested [33–37]. The computational algorithms for the inverse design of metasurfaces include mathematical optimization and artificial intelligence (AI) techniques, and they can be applied to 1) suggest a new metasurface design, 2) estimate the optical characteristics of the suggested metasurface efficiently, and 3) determine and optimize the metasurface design to satisfy the required specifications. Inverse design can in principle be conducted manually; however, for better efficiency and more iterations through automation, it is generally performed by designing a computational system called a generative model. In this review, the basic concepts of the computational algorithms that can be used for metasurface inverse design are introduced in Section 2, and recent inverse design studies are discussed in Section 3 by categorizing their roles in a generative model.
Mathematical optimization, also called mathematical programming in applied mathematics, has not only solved economic, industrial, and technical optimization problems through analytical approaches but has also provided the fundamental logic for traditional computational algorithms. Among these, topology optimization (TO) and the evolutionary algorithm (EA) are the two most widely used approaches for inversely designing metasurfaces [37,38]. TO is a geometry optimization method employed to achieve the best performance for a specific constraint within a given design space [39,40]. The main principle of TO is to repeatedly change the material distribution of the design while maintaining structural integrity. The procedure starts with an initial design, normally a solid block or volume, and gradually strips away extraneous material from locations where it does not enhance overall performance. The material is redistributed by the optimization algorithm in a way that improves the desired attributes. EAs are a group of optimization algorithms inspired by the biological processes of natural selection and evolution [41]. EAs are particularly effective at solving optimization problems in which multiple solutions or non-linear relationships make it challenging for standard mathematical or gradient-based approaches to succeed. The genetic algorithm (GA), particle swarm optimization (PSO), and ant-colony optimization (ACO) are the representative EAs used for metasurface inverse design [37]. The fundamental concept underlying the GA is to build a population of candidate solutions, generally depicted as individuals or chromosomes, and then let them go through evolutionary processes such as selection, reproduction, and mutation to improve over subsequent generations [42].
Each member of the population represents a potential solution to the optimization problem, and the quality of each solution is assessed using a fitness function that measures how effectively it solves the problem. PSO mimics the movement of a flock of birds or a school of fish, in which each individual updates its velocity according to the positions of the others to maintain the form of the flock [42,43]. The key point in PSO is that the surrounding individuals affect the velocity vector of one individual, which in turn affects its next position. Therefore, in PSO, the position of an individual corresponds to a specific metasurface design, and the velocity function corresponds to how that design will be changed. ACO is an algorithm that mimics ant colonies, which have evolved to use pheromones to locate food [44,45]. An ant probabilistically determines its next position as a function of the remaining concentration of the volatile pheromone and the distance to the point where the pheromone is located. The metasurface is optimized when the ant colony finds the optimal route to food.
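As a minimal illustration of these evolutionary steps, the sketch below runs a GA on a toy binary "metasurface" pattern; the bit string, target pattern, and fitness function are hypothetical stand-ins for a real design vector and physical evaluator:

```python
import random

def fitness(bits):
    # Toy surrogate "evaluator": reward patterns close to an assumed target
    # pixel pattern; a real evaluator would be a Maxwell solver or DNN.
    target = [1, 0, 1, 1, 0, 0, 1, 0]
    return sum(b == t for b, t in zip(bits, target))

def crossover(a, b):
    cut = random.randrange(1, len(a))  # single-point chromosome crossover
    return a[:cut] + b[cut:]

def mutate(bits, rate=0.1):
    # Flip each bit with a small probability (mutation)
    return [1 - b if random.random() < rate else b for b in bits]

def genetic_algorithm(pop_size=20, n_bits=8, generations=60, seed=0):
    random.seed(seed)
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)   # selection: keep the fittest half
        parents = pop[: pop_size // 2]
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

best = genetic_algorithm()
```

In a real metasurface GA, each bit would encode the presence of material in one pixel of the unit cell, and the fitness function would compare the simulated spectrum with the target spectrum.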
While there have been efforts to treat inverse design as a mathematical optimization problem, designing metasurfaces based on AI has also been researched. Recent AI techniques have been developed with a focus on machine learning (ML), which provides models and algorithms that let computers learn from experience, without being explicitly programmed, through supervised, unsupervised, or reinforcement learning [46,47]. Deep learning is an ML technique that utilizes deep neural networks (DNN), and it is effective in training machines on massive volumes of data [48,49]. DNNs mimic the structure of the human nervous system by constructing layers of interconnected nodes that learn hierarchical data representations [49]. A fully connected layer (FC), also called a multi-layer perceptron, is a fundamental DNN model in which each node is connected to all neurons in the previous layer, so every input affects the output node [50,51]. The number of neurons in an FC is a design parameter that affects the dimensionality of the output. The weights and biases of the neurons are modified during training using optimization techniques such as gradient descent and backpropagation to reduce the discrepancy between the predicted outputs and the ground-truth labels [51]. The rectified linear unit (ReLU) or sigmoid function is usually applied after the layers of the FC as an activation function for incorporating non-linearity into the neural network [51,52]. The convolutional neural network (CNN) is another type of DNN that is widely applied to treat images as input or output data sets [53,54]. A CNN is implemented using kernels, which are small matrices (usually square matrices with a dimension of 3 × 3) used for image processing [55]. Convolving the 3 × 3 kernel [−1, −1, −1; −1, 9, −1; −1, −1, −1] with an image consisting of a 1,024 × 1,024 data array results in an edge-sharpening effect on the input image.
By using different kernels, it is possible to extract post-processed images with different strengthened characteristics: these are called feature maps [56,57]. Normally, an activation function such as ReLU is applied to the feature map to assign non-linearity so that the neural network can learn more complex patterns [52]. The feature maps are then compressed by passing through a pooling layer to reduce computational costs and to extract spatially invariant features. The convolution set composed of a kernel, an activation function, and a pooling layer may be applied repeatedly several times [50,53,54]. Usually, an FC is connected to the last part of the CNN and used for image classification, but if an image generated in the middle of the CNN is output, the CNN can be used as an image-generating DNN [58]. Various studies have been conducted by representing the geometry of the metasurface as a pixelated image and learning it through a CNN [59]. As long as the search space is relatively small (coarsely pixelated images), the FC can be the best choice because it deals with every possible case; however, when the search space is large, as with high-definition images, the CNN is normally used for image processing.
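The kernel, activation, and pooling operations above can be sketched directly; the toy 8 × 8 image and the hand-rolled convolution below are illustrative choices, with the sharpening kernel taken from the example in the text:

```python
import numpy as np

def convolve2d(image, kernel):
    # 'Valid' 2-D convolution: slide the (flipped) kernel over the image
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    k = kernel[::-1, ::-1]  # flip for true convolution (symmetric here)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * k)
    return out

# Sharpening kernel from the text: [-1, -1, -1; -1, 9, -1; -1, -1, -1]
sharpen = np.array([[-1, -1, -1],
                    [-1,  9, -1],
                    [-1, -1, -1]], dtype=float)

image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0                                  # toy "pixelated design"
feature_map = np.maximum(convolve2d(image, sharpen), 0.0)  # ReLU non-linearity
pooled = feature_map.reshape(3, 2, 3, 2).max(axis=(1, 3)) # 2x2 max pooling
```

One such kernel-ReLU-pooling set corresponds to a single convolution stage of a CNN; stacking several stages yields progressively more abstract feature maps.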
To train these neural networks well, a large amount of data is required, but it is not easy to acquire numerous data sets for the design of a metasurface and its optical properties. Therefore, the transfer learning (TL) technique, which uses models pre-trained on other data sets, is also being applied [60–63]. In TL, the range of fine-tuning and freezing is determined by the amount of data to be learned and the correlation between the new data and the previously learned data [64,65]. For instance, if the amount of new data is sufficiently large and not related to the previously learned data, it is best to retrain all weights in the DNN using the new data set. Otherwise, it is necessary to decide whether to fine-tune only the weights of the activation layer, or the weights of several layers along with the activation layer. In addition to these various AI techniques, quantum machine learning (QML), which utilizes the uncertainty principle in artificial neurons, has also been demonstrated recently [66–70]. The DNN for QML is called a quantum neural network (QNN), and its performance has improved with evolving quantum computers and their increasing number of qubits.
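The fine-tune-or-freeze decision described above can be summarized as a simple rule of thumb; the numeric thresholds below are illustrative assumptions, not values from the cited studies:

```python
def tl_strategy(n_samples, similarity):
    """Toy transfer-learning heuristic (illustrative thresholds only):
    decide how much of a pre-trained network to fine-tune based on the
    new dataset's size and its similarity to the source-domain data,
    with similarity expressed on a 0-1 scale."""
    if n_samples > 10_000 and similarity < 0.3:
        return "retrain all layers"               # large, unrelated dataset
    if n_samples > 10_000:
        return "fine-tune several top layers"     # large, related dataset
    if similarity > 0.7:
        return "fine-tune the final layer only"   # small but similar dataset
    return "fine-tune the final layer plus a few top layers"
```

In practice, "freezing" a layer means excluding its weights from gradient updates so that only the selected layers adapt to the new metasurface data.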
By employing these mathematical optimization algorithms and DNNs, it is possible to construct a generative model that converges a metasurface design to the optimal solution. Furthermore, it is not required to select only one type of algorithm, and it is possible to utilize multiple algorithms simultaneously or sequentially for a generative model. With the development of new mathematical optimization algorithms, DNNs, and computer architectures, the generative models for metasurface design and their applications can show endless possibilities.
The term ‘Generative Model’ is often considered a classification of a DNN, as opposed to the ‘Discriminative Model’; however, this should not be taken as a strict classification with clear-cut definitions [36]. The reason is that some generative models, such as generative adversarial networks (GAN), contain the structure of a discriminative model within them [71]. Additionally, because it is possible to construct a generative model even without using a DNN, the generative model cannot be considered a subcategory of DNNs. Therefore, in this review, the generative model for metasurface inverse design is divided into an evaluator, a generator, and a criterion, as shown in Fig. 1, to discuss both DNN- and non-DNN-based algorithms. Through this, it is possible to identify whether each inverse design study makes its novel contribution to the evaluator, the generator, or the criterion of the generative model. An evaluator calculates or estimates the optical output of a specific metasurface design through physical simulations or DNNs. A generator inversely suggests possible solutions that may satisfy the input constraints. A criterion is the overall algorithm that provides an appropriate metasurface design by managing the feedback loop between the evaluator and the generator. For metasurface design, a simulation program based on Maxwell’s equations can serve as an evaluator, but it is also possible to design an evaluator using a DNN to reduce the computation time compared to conventional methods. In addition, an evaluator is applied to determine how well the inverse design presented by the generator satisfies the required optical properties. A DNN can also be used when designing a generator, providing a metasurface design according to the required optical properties.
The criterion is an algorithm or feedback system to improve the performance of the generative model so it can be applied to design a metasurface that satisfies the required optical properties.
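The interplay of the three components might be sketched as a generic feedback loop; the linear "evaluator", random-walk "generator", and tolerance below are toy stand-ins, not any specific published model:

```python
import random

def inverse_design(evaluator, generator, target, max_iters=200, tol=1e-3, seed=1):
    """Generic criterion: a feedback loop that keeps the best candidate so far
    and stops once the evaluator's output is close enough to the target."""
    random.seed(seed)
    best, best_err = None, float("inf")
    for _ in range(max_iters):
        candidate = generator(best)               # generator proposes a design
        err = abs(evaluator(candidate) - target)  # evaluator scores it
        if err < best_err:
            best, best_err = candidate, err       # criterion keeps the improvement
        if best_err < tol:
            break                                 # convergence condition met
    return best, best_err

# Toy stand-ins (assumed, not a real Maxwell solver): the "design" is a single
# diameter-like parameter and the "optical property" is a linear response.
evaluator = lambda d: 2.5 * d + 100.0
generator = lambda prev: (prev if prev is not None else 100.0) + random.uniform(-5, 5)
design, err = inverse_design(evaluator, generator, target=400.0)
```

Swapping in a physical simulator for `evaluator`, an EA or DNN for `generator`, and a more sophisticated acceptance rule for the criterion recovers the structure of the generative models discussed below.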
Since evaluators should reflect physical phenomena that occur in nature, it may seem reasonable to fabricate a metasurface according to a given design and measure its optical properties to perform the role of the evaluator. However, evaluation based on practical fabrication or characterization demands a significant amount of time, effort, and cost to carry out a single evaluation, let alone enough iterations for learning, and an unpredictable error range is inevitable. Therefore, physical simulations governed by Maxwell’s equations are usually used as evaluators for metasurface design. These physical simulations start by segmenting a computer-aided design (CAD) image of a given structure into spatial mesh structures. Both finite difference method (FDM) and finite element method (FEM) simulations then solve the physical equations on this discretized geometry. The difference between the approaches is that FDM approximates the derivatives on cubic or square meshes, whereas FEM approximates the solutions of the differential equations as polynomial shape functions [72]. For optical simulations whose governing physics is Maxwell’s equations, FDMs and FEMs can be categorized according to the domain of the differential equations: time- and frequency-domain. FDMs with time-domain and frequency-domain differential equations are called finite-difference time-domain (FDTD) and finite-difference frequency-domain (FDFD) methods, respectively [72]. FEM normally employs frequency-domain differential equations, while the time-domain finite element method (TDFEM) refers to an FEM based on time-domain differential equations [72]. Since these methods calculate the optical properties of metasurfaces based on physical principles, they can provide a logical understanding based on causal relationships if the calculation converges sufficiently to produce correct results.
Most of the studies that have used physical simulation as an evaluator employed FDTD or FEM because there are several more commercialized simulation programs compared to FDFD and TDFEM [73]. Many inverse design problems can be solved without necessarily using a specific simulation method, and thus commercialized software is sufficient, but there are special problems that essentially require the use of FDFD or TDFEM [73]. As an example of the FDM serving as an evaluator, Cai
Unlike physical simulations, which must solve Maxwell’s equations for all unit cells in each structure, a DNN-based evaluator predicts optical characteristics without solving numerous equations. After training a DNN with thousands of pairs of metasurface designs and their optical characteristics obtained from physical simulations, it can output an estimated optical property for an input metasurface structure in a few seconds [36,37]. As shown in Fig. 2, Liu
In the conventional metasurface design strategy, understanding the physics linking the metasurface structure and the incident light is important, and therefore the value of a geometrical parameter is varied within a certain range to analyze the effect of that parameter on the optical properties. For example, when optimizing a nanocylinder array design to elucidate the effect of the cylinder diameter on the peak wavelength of the localized surface plasmon resonance (LSPR), measuring and calculating the optical spectrum while varying the diameter from 80 to 120 nm at 20 nm intervals can provide the knowledge that the LSPR peak red-shifts to around 700 nm as the diameter increases [85]. With this physical understanding, if a new metasurface with an LSPR peak wavelength at 750 nm is required, it is logically valid to suggest a nanocylinder array with a diameter larger than 120 nm. This ‘causality’ is the main consideration when a new metasurface is suggested or generated during the forward design process.
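This forward-design reasoning amounts to extrapolating an observed trend; the sketch below fits a line to hypothetical diameter-versus-peak data consistent with the numbers quoted above (the 640 and 670 nm values are assumed for illustration):

```python
import numpy as np

# Hypothetical forward-design sweep: cylinder diameters (nm) and the
# measured LSPR peak wavelengths (nm); only the ~700 nm point at 120 nm
# comes from the text, the rest is an assumed red-shift trend.
diameters = np.array([80.0, 100.0, 120.0])
peaks = np.array([640.0, 670.0, 700.0])

slope, intercept = np.polyfit(diameters, peaks, 1)  # linear red-shift model
target_peak = 750.0
suggested_diameter = (target_peak - intercept) / slope  # invert the trend
```

The suggested diameter comes out above 120 nm, matching the qualitative conclusion in the text; a real forward design would refine this guess with further simulations or measurements rather than trusting the linear extrapolation.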
However, inverse design strategies do not focus on the causality between design parameters and the output spectrum. For inverse design using the EA, several generation algorithms have been developed by mimicking biological development, differentiation, and reproduction. The GA, which is the most widely used EA, implements a generator based on the crossover and mutation characteristics of chromosomes. Lin
DNNs can also be used as generators. Ghorbani
The decoder part of the variational autoencoder (VAE) can also be used as the generator. An autoencoder is a serial connection of an encoder DNN and a decoder DNN, and it trains itself to compress data efficiently into representative information such as statistical distributions [97,98]. When the input data are compressed by the encoder, the decoder inversely estimates the input data based on the compressed information [97]. By comparing the difference between the input data of the encoder and the output data of the decoder, the encoder is trained to produce core information while reducing the size of the data [97]. Using the autoencoder structure, a different type of data generation model, the VAE, can be trained. The basic structures of the autoencoder and the VAE are the same, but the VAE focuses on training the decoder as a generator by imposing a probabilistic distribution in the encoding process [98,99]. The decoder of the VAE should estimate and generate data that are similar to the input data of the encoder. It is also possible to use the VAE alone for inverse design without its decoder being incorporated into a generative model [100–102]. From the perspective of using the VAE as a component of the generator of a generative model, the VAE-GAN has been demonstrated and employed for metasurface inverse design [103–105]. Liu
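The encode-sample-decode flow of a VAE can be sketched as follows; the linear layers and random weights are toy placeholders, and only the reparameterized sampling step reflects the actual VAE mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, W_mu, W_logvar):
    # Toy linear encoder (illustrative weights): maps a design vector to
    # the mean and log-variance of a latent Gaussian distribution.
    return x @ W_mu, x @ W_logvar

def reparameterize(mu, logvar, rng):
    # z = mu + sigma * eps: sampling stays differentiable w.r.t. mu, logvar,
    # which is what lets the decoder be trained as a generator.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decoder(z, W_dec):
    # Toy linear decoder: in a trained VAE this is the reusable generator
    return z @ W_dec

x = rng.standard_normal((4, 8))           # four 8-pixel "designs"
W_mu = rng.standard_normal((8, 2))
W_logvar = rng.standard_normal((8, 2))
W_dec = rng.standard_normal((2, 8))

mu, logvar = encoder(x, W_mu, W_logvar)
z = reparameterize(mu, logvar, rng)
x_hat = decoder(z, W_dec)                 # reconstruction to compare with x
```

After training, sampling z directly from the latent distribution and passing it through the decoder generates new candidate designs without an input image.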
A criterion that uses the logical relationship between geometrical parameters and the optical response corresponds to a forward design strategy, in which optimization based on physical understanding functions as the criterion. Inverse design instead starts by replacing the researcher’s reasoning with a computational black box such as a DNN. The important point is that, for a successful inverse design, the criterion must decrease the difference between the target and the sample’s optical properties and converge the metasurface geometry to the optimized solution. The criterion can be categorized into mathematical optimization-based and AI-based criteria. The former operates by imposing certain constraints with mathematical logic, and the latter automatically trains itself by optimizing the weights of nodes through iterations.
Starting from the mathematical optimization-based criterion, Andkjær
If both the evaluator and the generator are designed based on the DNN structure, it is possible to improve the performance of one by repeating routines against the other. A representative example is the GAN, which imposes an adversarial criterion on a generative model. In the GAN framework, the evaluator tries to distinguish the real image from the generator-made (fake) image, while the generator tries to generate an image that looks real enough that it cannot be distinguished by the evaluator [122–124]. If the evaluator correctly distinguishes the generator-made image, the generator gets feedback, and if the generator successfully deceives the evaluator, the evaluator gets feedback [122–124]. Since both the evaluator and the generator are improved by competing, this is called the adversarial criterion. Liu
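A numeric caricature of this alternating feedback, with a one-parameter "generator" and a threshold "discriminator" standing in for DNNs (all values assumed for illustration):

```python
import random

random.seed(42)

# Real samples come from a fixed distribution; the "generator" is a single
# shifting mean and the "discriminator" a threshold classifier. Each round,
# whichever side loses gets updated, mirroring GAN-style alternation.
real_mean = 5.0
gen_mean = 0.0          # generator starts far from the real distribution
threshold = 2.5         # classify a sample as real if it exceeds this

for step in range(500):
    real = random.gauss(real_mean, 1.0)
    fake = random.gauss(gen_mean, 1.0)
    disc_correct = (real > threshold) and (fake <= threshold)
    if disc_correct:
        # Discriminator won: generator gets feedback, moves toward the boundary
        gen_mean += 0.05 * (threshold - gen_mean)
    else:
        # Generator fooled it: discriminator gets feedback, adapts its boundary
        threshold += 0.05 * ((real + gen_mean) / 2 - threshold)
```

As the two sides compete, the generator's distribution drifts toward the real one, which is the qualitative behavior the adversarial criterion exploits; real GANs do the same with gradient updates on two networks.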
Figure 5 shows simplified possible combinations of generative models. The generative model operated by TO can use physical simulations and DNNs as an evaluator, but it does not use a generator because TO only searches for optimal solutions in a pre-defined design space. When using the EA as a criterion, it is possible to use all types of evaluators presented in the figure, but a DNN-based generator is not allowed. For the adversarial criterion, which corresponds to the GAN-based system, both the evaluator and the generator should take the form of DNNs since this criterion trains the evaluator and generator at once. When training the evaluator, a data set generated by physical simulation is generally used. However, since the amount of data required to train an evaluator is very large, a virtual data set generated by modifying and combining several simulation results is occasionally used. The generator is generally trained through iterative backpropagation; however, it is also possible to utilize a pre-trained decoder from a VAE or a denoiser from a diffusion model. This figure does not describe strict constraints, but provides a supportive reference for constructing a generative model. For example, when utilizing conventional TO, generators should not be included for metasurface inverse design. However, by modifying or combining the algorithms to obtain a certain structure for the criterion, it can be possible to use other types of evaluators or generators while satisfying the convergence condition.
In this review, we discussed metasurface inverse design research, including mathematical optimization and deep learning approaches, by focusing on the contributions to the generative model. Since the essential steps for inversely designing metasurfaces are discrimination and generation, it is necessary to understand the prior art on inverse design systematically according to the role each study plays and the criterion it uses within the overall generative model. The conventional and most widely used evaluator is a physical simulator based on Maxwell’s equations. Alternatively, since evaluating the output optical spectrum is similar to a classification problem in AI, FCs and CNNs trained on simulation data sets can also be used as evaluators. The generator can be implemented through programmed design recombination or random generation. Well-trained FCs or CNNs can serve as the generator through backpropagation, or DNNs trained within VAEs or diffusion models can be employed as the generator directly. The criteria were discussed by categorizing them into mathematical optimization and AI approaches; however, criteria can be modified and combined in diverse ways to achieve better performance depending on the application.
Several directions can increase the performance of inverse design models. 1) Because inverse design cannot cover the entire design space within limited computational power, it cannot always find the global optimum. Therefore, the problem must be accurately defined by applying various constraints and boundary conditions to obtain the optimal solution at the given cost. 2) When dealing with three-dimensional metamaterials with complex functionality, developing both the evaluator and the generator with DNNs can provide a more appropriate solution for the exponentially increasing computation costs. 3) One of the biggest problems when using physical values as a data set for training a DNN is the high cost in terms of human and computational resources to obtain enough data. Therefore, TL and few-shot learning algorithms should be considered to train DNNs efficiently. 4) One of the purposes of inverse design is to investigate unexplored physics in nature, as mentioned in the introduction. For this purpose, it is helpful to estimate the decision-making process of the DNN. However, because of the high connectivity inside a DNN, it is difficult to understand how a DNN generates its output from a given input. Therefore, algorithms for acquiring the interpretability of DNNs should be studied using a well-trained DNN evaluator. 5) The new era of AI techniques that use QNNs can process more information faster by reducing time complexity with qubits. Computer scientists have demonstrated AI algorithms that combine classical ML and QML, and they have been replacing more steps of AI algorithms with quantum platforms to increase computing efficiency. It is also known that QNNs can be trained more efficiently with fewer data, fewer iterations, and fewer validation processes. Inverse design is a multidisciplinary research field that applies mathematics and computer science to nanophotonics.
This integrative technology has shown several possibilities for overcoming the limitations of conventional forward design approaches. It is expected that inverse design research will provide new insights into how humans can better understand and control light in the future.
This research was supported by the National Research Foundation (NRF) of Korea under the “Korean-Swiss Science and Technology Program” (2019K1A3A1A1406720011) and was also supported by the BK21 FOUR (Fostering Outstanding Universities for Research) funded by the Ministry of Education (MOE) of Korea and National Research Foundation (NRF) of Korea. The authors thank Jongho Jung, Sungmin Lee, and Kichang Lee, Ph.D. students in the School of Integrated Technology, Yonsei University, for providing supportive discussions about deep neural network algorithms.
The authors declare no conflicts of interest.