The cost of CMOS image sensor pixel-based digital camera systems is being reduced through the use of smaller pixel sizes and larger fill factors. However, CMOS pixel size reduction is only acceptable if it does not sacrifice image quality. As CMOS pixel sizes continue to decrease, the image signal-to-noise ratio drops and crosstalk between adjacent sensor pixels increases. These effects can be offset by careful design optimization through computer simulation which, at current pixel dimensions, requires a full vectorial solution of Maxwell's equations.
In this topic we discuss the trends in CMOS image sensors, the implications for simulation, and the types of results that can be simulated, and we describe the full simulation methodology used to achieve them.

The figure below shows the typical image sensor pixel size as a function of year.
At larger pixel sizes, ray tracing can give excellent results. However, in the mid-2000s, pixel sizes dropped below about 5 μm, where ray tracing can no longer produce accurate results; full vectorial solutions of Maxwell's equations, such as FDTD, give good agreement with experimental results (see Hirigoyen et al.).
As we approach wavelength-scale structure, many typical questions that we ask for larger structures no longer make sense. For example, if an electron-hole pair is generated in the Si at a particular location, we may want to know which microlens the photon passed through before being absorbed. In wave optics, this question cannot be answered because the photon, which is a wave, passes through all microlenses, and the probability of it being absorbed at a particular location depends on the interference pattern it creates with itself. If we block one of the paths, for example by covering one lens, we will modify the interference pattern. In reality, the photon takes all possible paths, and all paths must be included to get the correct result. It is only at larger length scales that we can ignore these multipath interference effects.
We may want to ask which path a photon took before being absorbed.
At the wavelength scale, each photon creates a complex interference pattern that determines the probability of absorption in the Si. It is not possible to determine one particular path without modifying the interference pattern. 
There are a number of things we would like to calculate:

•Quantum Efficiency (QE): This is the ratio of collected electrons to incoming photons. The QE is affected by the illumination conditions, the objective lens, the Optical Efficiency (OE) of the optical stack of the image sensor and the efficiency of the collection electronics. A full calculation of the QE involves both optical and electrical modeling.

•Optical Efficiency (OE): This is the ratio of electrons generated in the silicon to the number of incoming photons and is a key component of a QE calculation. Like the QE, this depends on the illumination conditions, the objective lens and the optical stack of the image sensor. Often, we include some aspects of the QE in the OE, for example, by only counting electrons generated in a particular volume of the silicon where they have a high probability of being collected by the collection electronics. However, we still refer to this as the OE and reserve the term QE for calculations involving both optical and electrical modeling.

•Optical Cross-Talk, the Point Spread Function (PSF) and the Modulation Transfer Function (MTF): The ability of a camera to resolve spatial features can be measured by the PSF and its Fourier transform, the MTF. These quantities depend on the objective lens, the optical stack and the collection electronics. The definition of these quantities is more complex in a digital camera because the pixels of the image sensor introduce a digitized discreteness to the problem.

•Spectral Cross-Talk and Color Matrix Coefficients: We can determine the response of different pixels (such as red, green and blue) to light at any wavelength and thus determine the Spectral Cross-Talk. This information can be used to determine the Color Matrix Coefficients, which can be used to correct the colors of an image; however, large, negative, off-diagonal terms in the Color Matrix reduce the signal-to-noise ratio (S/N) and should ideally be avoided.

•Electrical Cross-Talk: Electrical Cross-Talk occurs when an electron generated under one pixel is captured by a neighboring pixel. This type of cross-talk contributes to both the Point Spread Function and the Spectral Cross-Talk. Modeling this effect requires a combination of optical and electrical modeling.
Almost anything we want to calculate depends on the illumination conditions and the objective lens. The figure below shows the illumination of the system by a point source at a large distance from the camera. This point source will be focused down to an Airy-disk-like spot at the surface of the image sensor. This type of illumination can be used to calculate the PSF; however, some care must be taken because we need to account for the variety of spot alignments that can occur with respect to the digitized pixels on the image sensor. For many simulation results (QE, OE, Spectral Cross-Talk, Color Matrix Coefficients, Electrical Cross-Talk), we are generally interested in the case of uniform illumination. This is the situation where the object fills the entire field of view of the camera, as depicted in the figure below. The large object is composed of a large number of point sources, each of which emits light incoherently. This creates an incoherent sum of Airy-disk-like spots on the surface of the image sensor. All quantities of interest, from the electric field intensity (|E|²) in the silicon to the OE, can be calculated simply by summing or averaging the results of each individual simulation incoherently. The electric field intensity in the silicon under uniform illumination is given by the incoherent sum of the intensity from each simulation, and the OE is given by the average OE over all simulations. $$ \begin{aligned}\left|\vec{E}\right|^{2} &=\sum_{i=1}^{N}\left|\vec{E}_{i}\right|^{2} \\ OE &=\frac{1}{N} \sum_{i=1}^{N} OE_{i} \end{aligned} $$
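In code, this incoherent combination is straightforward. A minimal sketch, assuming the per-simulation fields and OE values are already available (the arrays and numbers below are made up for illustration):

```python
import numpy as np

# Made-up stand-ins for the results of N individual point-source simulations:
# E_fields holds the complex field of each simulation, oe_values its OE.
rng = np.random.default_rng(0)
E_fields = [rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
            for _ in range(5)]
oe_values = [0.62, 0.58, 0.60, 0.61, 0.59]

# Field intensity under uniform illumination: incoherent sum of |E_i|^2
intensity = np.sum([np.abs(E) ** 2 for E in E_fields], axis=0)

# OE under uniform illumination: average of the individual OE values
oe_uniform = np.mean(oe_values)
print(oe_uniform)
```

Note that the fields are combined as intensities (|E|²), never as complex amplitudes; a coherent sum would incorrectly introduce interference between independent point sources.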
Practically, this means running a new simulation for each beam position and summing or averaging all results incoherently. Since the image sensor array is locally periodic, we only need to do this over one unit cell (which typically involves 2x2 subpixels). In fact, once we have the results over a single unit cell, we can use these results to calculate not only the response to uniform illumination but also the response to other types of illumination, such as the illumination by an object that should ideally illuminate one full pixel. This is the approach used to calculate the PSF in Point spread function. 
It can be shown that the uniform illumination conditions described above are mathematically equivalent to a weighted, incoherent sum of all plane waves that are supported by the objective lens (see Jérôme Vaillant et al. for details). This is graphically depicted in the figure below, where we can see the different possible angles of incidence that are supported by the objective lens.
The response of the system under uniform illumination is equal to an incoherent, weighted integral of plane waves at all angles of incidence supported by the objective lens. $$ \begin{aligned}\left|\vec{E}\right|^{2} &=\iint d k_{x}\, d k_{y}\,\left|W(\vec{k})\right|^{2}\left|\vec{E}(\vec{k})\right|^{2} \\ OE &=\frac{\iint d k_{x}\, d k_{y}\,\left|W(\vec{k})\right|^{2}\, OE(\vec{k})}{\iint d k_{x}\, d k_{y}\,\left|W(\vec{k})\right|^{2}} \end{aligned} $$ The weight factor W is a property of the objective lens and the pixel location. A good approximation of W for the central pixel in low-NA systems is $$ \begin{aligned} W &=\begin{cases}1, & \left|\vec{k}_{\parallel}\right| \leq k_{0} \cdot \mathrm{NA} \\ 0, & \left|\vec{k}_{\parallel}\right| > k_{0} \cdot \mathrm{NA}\end{cases} \\ k_{\parallel}^{2} &=k_{x}^{2}+k_{y}^{2} \end{aligned} $$ where k∥ is the in-plane wave vector and NA is the numerical aperture of the objective lens. In other words, W is 1 for all angles supported by the objective lens and 0 for all others.
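The weighted k-space integral can be estimated numerically once the angular response OE(k) is known on a grid. A minimal sketch, with an assumed wavelength and NA and a made-up smooth function standing in for the simulated OE(k):

```python
import numpy as np

# Assumed parameters (for illustration only)
k0 = 2 * np.pi / 0.55e-6   # free-space wavevector at 550 nm
NA = 0.3                   # numerical aperture of the objective lens

# Uniform k-space grid covering the integration region
kx = np.linspace(-0.5 * k0, 0.5 * k0, 201)
ky = np.linspace(-0.5 * k0, 0.5 * k0, 201)
KX, KY = np.meshgrid(kx, ky, indexing="ij")

# Top-hat weight |W|^2 for the central pixel: 1 inside the NA cone, 0 outside
W2 = (np.hypot(KX, KY) <= k0 * NA).astype(float)

# Made-up smooth angular response OE(k) standing in for simulated data
oe_of_k = 0.7 * np.exp(-(KX**2 + KY**2) / (k0 * NA) ** 2)

# OE under uniform illumination: |W|^2-weighted average over all angles
# (the common dkx*dky factor cancels between numerator and denominator)
oe_uniform = np.sum(W2 * oe_of_k) / np.sum(W2)
print(oe_uniform)
```

With real simulation data, `oe_of_k` would instead hold the OE obtained from a plane-wave simulation at each grid point.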
The figure below shows the situation for a pixel near the edge of the image sensor. In this case, the range of incident angles becomes asymmetric. We define the chief ray angle (CRA) as the angle of the ray passing from the center of the objective lens to the pixel under consideration. The weight factor used to calculate the field intensity and OE under uniform illumination changes with the position of the pixel. A good approximation of W for an edge pixel in low-NA systems is $$ W=\begin{cases}1, & \left|\vec{k}_{\parallel}-\vec{k}_{\parallel}^{CRA}\right| \leq k_{0} \cdot \mathrm{NA} \\ 0, & \left|\vec{k}_{\parallel}-\vec{k}_{\parallel}^{CRA}\right| > k_{0} \cdot \mathrm{NA}\end{cases} $$ where k∥ is the in-plane wave vector, k∥CRA is the in-plane wave vector of the CRA and NA is the numerical aperture of the objective lens. In other words, W is simply shifted in k-space by the in-plane CRA wave vector.
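A sketch of the shifted window, assuming a 550 nm wavelength, NA = 0.3 and a hypothetical 20° CRA:

```python
import numpy as np

# Assumed parameters (for illustration only)
k0 = 2 * np.pi / 0.55e-6                  # free-space wavevector at 550 nm
NA = 0.3                                  # numerical aperture
cra_deg = 20.0                            # hypothetical chief ray angle
k_cra = k0 * np.sin(np.radians(cra_deg))  # in-plane CRA wavevector (along x)

kx = np.linspace(-k0, k0, 401)
ky = np.linspace(-k0, k0, 401)
KX, KY = np.meshgrid(kx, ky, indexing="ij")

# Central pixel: W = 1 where |k_par| <= k0*NA
W_center = np.hypot(KX, KY) <= k0 * NA
# Edge pixel: the same window shifted by the in-plane CRA wavevector
W_edge = np.hypot(KX - k_cra, KY) <= k0 * NA

# Both windows enclose the same area in k-space (same range of angles)
print(int(W_center.sum()), int(W_edge.sum()))
```

The edge-pixel window is identical in size; only its center moves, which is why optimizing a design for an edge pixel typically means recentering the angular response (e.g. by shifting microlenses) rather than reshaping it.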
The calculation of QE and OE is therefore identical to calculating the integral of the angular response curve. Please note, though, that:
•This integral is actually over in-plane k and not angle, although there is little difference for low-NA systems since k/k0 = sin(θ) ≈ θ when θ << 1.
•This is actually a 2-dimensional integral over both kx and ky.
A typical angular response curve is shown below as a function of kx/k0. In the first figure, the integration window is centered at kx = 0, which corresponds to calculating the OE for the central pixel under uniform illumination. The second curve shows the integration window for an edge pixel, which spans the same width of 2·NA but is shifted by the kx of the CRA.
Now that we understand the important relationship between the angular response curves and the OE, QE or electric field intensity under uniform illumination, we can see that there are some simple steps we can take to optimize the OE of the image sensor:

1.We can simulate only one plane wave corresponding to the CRA, which is the angle at the center of the integration window. If we optimize the OE at this one angle, we will likely improve the entire integral. This is normally accomplished by shifting microlenses and other layers, which shifts the peak of the angular response curve to be centered on the integration window, thereby maximizing the integral. This is a very efficient way to optimize the OE of the system under uniform illumination without requiring a large number of simulations. This approach is taken when calculating optimal microlens shifts for various CRAs at Microlens shift. In this approach we are really estimating an integral by sampling only one point at the center of the integration window.

2.We can simulate a plane wave corresponding to the CRA plus a few angles near the edge of the integration window and combine these results with appropriate weights. In this approach we are trying to maximize the function at the center of the integration window AND at its edges; essentially, we are estimating an integral by sampling only a few points. Since the angular response curve is normally a fairly smooth function over the integration window, this approach should maximize the true integral. This approach is taken when optimizing the lens radius of curvature in Optimize microlens ROC.

3.We can run enough simulations to correctly resolve the angular response curves and calculate the integral accurately. This is clearly the most accurate approach but can involve a large number of simulations, particularly for 3D simulations where we need to sample both kx and ky.
Sampling only 10 points along each of kx and ky results in 10x10 = 100 different angles of incidence. Still, this approach can give excellent agreement with experimental results for the OE under uniform illumination (see Hirigoyen et al.).
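One way to set up such a sampling in practice is to lay a uniform grid of in-plane k points over the integration window and run one plane-wave simulation per point. A minimal sketch with an assumed wavelength and NA:

```python
import numpy as np

# Assumed parameters (for illustration only)
k0 = 2 * np.pi / 0.55e-6   # free-space wavevector at 550 nm
NA = 0.3                   # numerical aperture of the objective lens

# 10 samples along each of kx and ky spanning the integration window
k_samples = np.linspace(-k0 * NA, k0 * NA, 10)
KX, KY = np.meshgrid(k_samples, k_samples, indexing="ij")

# Top-hat weight on the samples: only points inside the NA circle contribute
in_window = np.hypot(KX, KY) <= k0 * NA

# Each (kx, ky) pair inside the window would be one plane-wave simulation;
# the uniform-illumination OE is then the weighted average of their OEs.
print(KX.size, int(in_window.sum()))  # 100 grid points, 60 inside the circle
```

For an edge pixel, the same grid would simply be centered on the in-plane CRA wavevector instead of zero.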
The unpolarized results can be obtained from the polarized results by averaging 2 orthogonal polarizations, typically S and P, incoherently. For a proof and more details, please see Unpolarized beam. Unfortunately, this means that we have to run 2 simulations for each beam or plane wave to obtain the unpolarized result. Initially, it may be desirable to use only one polarization and ignore polarization effects to save time, but at these length scales differences between the results for S and P polarizations are expected, and these effects should not be ignored for final results.
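A sketch of the unpolarized average, assuming the two polarized OE values have already been simulated (the numbers are made up):

```python
# Hypothetical OE results from two polarized simulations of the same beam
oe_s = 0.63  # S polarization (made-up value)
oe_p = 0.57  # P polarization (made-up value)

# Unpolarized OE: incoherent average of the two orthogonal polarizations
oe_unpolarized = 0.5 * (oe_s + oe_p)
print(oe_unpolarized)
```

The same incoherent average applies to field intensities, i.e. the unpolarized |E|² is (|E_S|² + |E_P|²)/2; the complex fields themselves are never averaged.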