### Deconvolution of the Raman spectrum of the SiGe multilayer

We denote by the symbol Ф the experimental Raman spectrum of the multilayer, as shown in Figure 1a (black). Symbol Ф implicitly contains all the information about the composition profiling. Its deconvolution process starts by collecting an ensemble of a number *n* of ϕ_{x} spectra with 0 < *x* < 1. The collection must cover a spectral range *R*, including the three SiGe main Raman modes. Each ϕ_{x} must be normalized in order to have its integrated intensity on *R* equal to 1. The deconvolution of Ф is then obtained by finding the set of *n* coefficients *a*_{x} that minimize the quantity^{a} Δ_{1} defined as follows:

${\Delta}_{1}={\left(\Phi -{\displaystyle \sum {a}_{x}{\varphi}_{x}}\right)}^{2}\text{.}$

(1)
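As an illustration of Equation 1, the following minimal sketch fits the coefficients *a*_{x} by ordinary least squares. It rests on several assumptions that are not taken from the original work: Δ_{1} is evaluated as the squared residual summed over the sampled spectral points, the minimization is solved through the normal equations (the text does not specify the algorithm), and the reference spectra are replaced by hypothetical single Gaussian peaks whose position moves linearly with *x*. Real ϕ_{x} spectra contain the three SiGe modes and would be measured or taken from the literature.

```python
import math

def reference_spectrum(x, shifts):
    """Hypothetical stand-in for phi_x: one Gaussian peak, unit integrated intensity."""
    center = 520.0 - 70.0 * x          # invented linear peak position, not a fitted law
    raw = [math.exp(-0.5 * ((w - center) / 5.0) ** 2) for w in shifts]
    area = sum(raw) * (shifts[1] - shifts[0])
    return [v / area for v in raw]     # normalization over the range R

def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol

def deconvolve(measured, basis):
    """Coefficients a_x minimizing sum_k (Phi_k - sum_x a_x phi_x,k)^2."""
    gram = [[sum(p * q for p, q in zip(bi, bj)) for bj in basis] for bi in basis]
    proj = [sum(p * m for p, m in zip(bi, measured)) for bi in basis]
    return solve_linear(gram, proj)

def delta1(measured, basis, coeffs):
    """Squared residual of Equation 1, summed over the spectral points."""
    total = 0.0
    for k, m in enumerate(measured):
        model = sum(a * phi[k] for a, phi in zip(coeffs, basis))
        total += (m - model) ** 2
    return total

shifts = [400.0 + 0.5 * i for i in range(301)]   # Raman shift axis (cm^-1), range R
compositions = [0.2, 0.4, 0.6, 0.8]
basis = [reference_spectrum(x, shifts) for x in compositions]

# Synthetic "experimental" spectrum built from known coefficients.
true_coeffs = [0.5, 0.0, 0.3, 0.2]
measured = [sum(a * phi[k] for a, phi in zip(true_coeffs, basis))
            for k in range(len(shifts))]

recovered = deconvolve(measured, basis)
residual = delta1(measured, basis, recovered)
print([round(a, 6) for a in recovered], residual)
```

With noisy data or a finely sampled basis, an unconstrained fit of this kind can return negative coefficients; a non-negativity constraint would be a natural addition, but it is not stated in the text and is left out of the sketch.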

The determination of the spectral contributions *a*_{x} amounts to a deconvolution of the experimental spectrum Ф and represents an important added value with respect to the Raman-based methods of previous works[12, 17], which rely on the assumption that the sample is homogeneous. The set of *a*_{x} reveals the presence of a distribution of compositions within the probed layer. In addition, knowledge of *a*_{x} can also be the starting point for a reconstruction of the composition profile. We will demonstrate these features by analyzing a SiGe multilayer and SiGe self-assembled islands.

The deconvolution must be carried out starting from a set of ϕ_{x} that can be measured or taken from the work of Picco et al.[15]. The procedure starts with a basis formed by a limited number of ϕ_{x}, and therefore with a relatively large sampling interval Δ*x*. At this stage, whatever the initial seeds for *a*_{x}, the algorithm always converges to the same unique solution. The next step is to decrease Δ*x*, thereby increasing the number of ϕ_{x}, and to repeat the algorithm. The process continues as long as shrinking Δ*x* leads to a decrease of Δ_{1} and the solution remains unique. When the uniqueness condition is no longer satisfied, that solution is discarded and the process ends. In the example reported here, we obtained the best spectrum with a unique solution using 21 ϕ_{x} spectra with a sampling interval Δ*x* of 0.05. The minimization of Equation 1 generates a reconstructed spectrum Σ*a*_{x}ϕ_{x} (Figure 1a, upper red spectrum), which most closely approximates Ф. Inspection of the *a*_{x} values shows that the spectrum has contributions from compositions around [0.20 to 0.25], [0.40 to 0.45], [0.55 to 0.65], and [0.75 to 0.95], as shown in Figure 1b. These *x* values agree with the nominal compositions within an accuracy Δ*x* ≈ 0.10, a figure that demonstrates the potential of this kind of analysis.

Note that there are some limitations to the applicability of this procedure. The method cannot work in nanostructures where strong phonon confinement takes place, typically those with dimensions of a few nanometers. In all other cases, the main problems in the determination of *a*_{x} arise from mechanical strain. Since the ϕ_{x} spectra are acquired from unstrained bulk, the effect of strain on Ф cannot be taken into account in Equation 1. The small discrepancies in Figure 1a are indeed related to the small residual strain in the stack. Other works in the literature[12] show, for example, that a biaxial strain of the order of 1% modifies the spectral position of the Si-Si band like a change in *x* of about 10%. In this sample, we checked the residual strain in the layers by X-ray diffraction (+0.3%, +0.1%, +0.1%, and −0.1% from top to bottom); therefore, the success of the deconvolution of Figure 1 relies on the fact that the strain was small enough not to corrupt the analysis significantly. It is important to underline that Figure 1b was obtained from one single spectrum and shows clearly that Ф is a superposition of several contributions, shedding light on inhomogeneous Ge distributions in the sample: a result which is beyond the grasp of the optical methods reported in the literature.
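A rough order-of-magnitude check of the strain argument can be written in a few lines. Two assumptions are made here that go beyond the text: the quoted "about 10%" is read as an absolute change in *x* of 0.10, and the apparent composition error is taken to scale linearly with strain. Only the magnitude of the effect is estimated, not its sign.

```python
# From the text: about 1% biaxial strain mimics a change in x of about 10%,
# read here as an absolute change of 0.10 (assumption).
X_SHIFT_PER_PERCENT_STRAIN = 0.10

# Residual strain per layer in percent, top to bottom (X-ray diffraction, from the text).
residual_strain_percent = [0.3, 0.1, 0.1, -0.1]

# Assumed linear scaling; only the magnitude of the apparent shift is estimated.
apparent_x_error = [abs(s) * X_SHIFT_PER_PERCENT_STRAIN
                    for s in residual_strain_percent]
worst_case = max(apparent_x_error)
print([round(e, 3) for e in apparent_x_error], round(worst_case, 3))
```

Under these assumptions the largest apparent error, from the top layer, is about 0.03 in *x*, which is below the sampling interval Δ*x* = 0.05 of the basis.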

### Composition profiling of the SiGe multilayer

The *a*_{x} values show that the spectrum is the result of a superposition of different spectra from different compositions, but they cannot be taken directly as a quantitative measure of the relative abundance of each composition. The reason is that the *a*_{x} values are also influenced by the Raman efficiency *S*_{x} and by the spatial attenuation of the excitation laser. Nevertheless, once the *S*_{x} values are known[15], it is possible to extract from the decomposition of Ф further information about the inner structure of the sample, namely the thickness *d*_{x} of each layer of constant composition. Since the test sample is a stack of layers, we can look for the set of *d*_{x} that minimizes the quantity Δ_{2} defined as follows:

${\Delta}_{2}={\displaystyle \sum {\left(\frac{{a}_{x}}{{\displaystyle \sum {a}_{x}}}-\frac{{I}_{x}}{{\displaystyle \sum {I}_{x}}}\right)}^{2}}\text{,}$

(2)

where *I*_{x} is a quantity proportional to the spectral contribution *a*_{x} and is expressed as follows:

${I}_{x}={e}^{-2m}\phantom{\rule{0.12em}{0ex}}{S}_{x}\phantom{\rule{0.12em}{0ex}}\left[{L}_{x}\left(1-{e}^{-2{d}_{x}/{L}_{x}}\right)\right]\text{,}$

(3)

where *L*_{x} is the laser penetration depth for the concentration *x*[17]. The quantity in square brackets is proportional to the integrated excitation of the single layer, where the excitation decreases as a function of depth *z* as exp(−*z*/*L*_{x}), and the factor 2 takes into account the self-absorption. The quantity *m* is defined as follows:

$m={\displaystyle \sum {d}_{{x}^{\prime}}/{L}_{{x}^{\prime}}}\text{,}$

(4)

where the sum runs over all the layers *x′* above the layer *x*, so that *e*^{−2m} represents the attenuation of the excitation and collection for the buried layer *x*. The minimization of Equation 2 leads to the determination of *d*_{x}. In this part of the procedure, the main source of error is the uncertainty in *L*_{x}, which matters most for the buried layers. This can be observed, for example, by plotting a spectrum reconstructed from the *d*_{x} values obtained by SEM (Figure 1a, lower red spectrum): the Si-Si and Si-Ge modes are well reproduced because they are generated mostly by the upper layers, whereas the simulation of the Ge-Ge mode is not accurate since it is generated by the lower buried layers. In addition, the condition *d*_{x} >> *L*_{x} must not hold for any of the layers involved in the simulation, i.e., each layer must be properly excited by the laser throughout its whole thickness; this represents another possible limitation of the technique. In this sample, the deepest layer did not meet this requirement and was excluded from the minimization. The results are reported in Figure 1c (red line). During minimization, we fixed the total thickness of the stack *d*_{stack} and imposed that Ge-richer layers lie below Si-richer layers. Within the region of the sample involved in the simulation, for each value of the vertical position *z*, the composition profile obtained with the Raman analysis matches the profile independently measured by SEM (Figure 1c, black) within the remarkable accuracy of 10%.
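Equations 2 to 4 can be illustrated with a minimal sketch for a two-layer stack. Everything numerical here is hypothetical: the penetration depths, Raman efficiencies, and thicknesses are invented values, the target coefficients are generated from the model itself rather than from a measured spectrum, and the minimization is a plain scan over integer thicknesses with the total thickness held fixed. The text does not specify which minimizer was used.

```python
import math

def layer_intensities(thickness, penetration, efficiency):
    """I_x for each layer, listed from top to bottom (Equations 3 and 4)."""
    intensities = []
    m = 0.0  # accumulated d/L of the layers above the current one (Equation 4)
    for d, L, S in zip(thickness, penetration, efficiency):
        excited = L * (1.0 - math.exp(-2.0 * d / L))   # term in square brackets
        intensities.append(math.exp(-2.0 * m) * S * excited)
        m += d / L
    return intensities

def delta2(coeffs, intensities):
    """Equation 2: squared mismatch between normalized a_x and normalized I_x."""
    sum_a, sum_i = sum(coeffs), sum(intensities)
    return sum((a / sum_a - i / sum_i) ** 2 for a, i in zip(coeffs, intensities))

# Hypothetical two-layer stack, Si-richer layer on top (all lengths in nm).
penetration = [200.0, 60.0]   # invented L_x values
efficiency = [1.0, 1.5]       # invented S_x values
total_thickness = 100.0       # fixed during the minimization, as in the text
true_top = 30.0

# Synthetic target coefficients generated from the model itself.
coeffs = layer_intensities([true_top, total_thickness - true_top],
                           penetration, efficiency)

def mismatch(top):
    trial = [float(top), total_thickness - top]
    return delta2(coeffs, layer_intensities(trial, penetration, efficiency))

# Scan the top-layer thickness in 1 nm steps; the bottom layer takes the remainder.
best_top = min(range(1, 100), key=mismatch)
print(best_top, mismatch(best_top))
```

For more than two layers the same `delta2` function could be passed to a general-purpose minimizer, with the ordering constraint (Ge-richer layers below Si-richer ones) expressed by the fixed order of the lists.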

### SiGe nanoislands

To test its validity on different structures, we applied this method to self-assembled SiGe islands grown by the Stranski-Krastanov process on a flat Si(001) substrate. The experimental spectrum of the islands, shown in Figure 2a, was analyzed with Equation 1. Each island is treated as a stack of homogeneous SiGe disks. The signal from the substrate is excluded from the algorithm. Contributions from compositions between 0.25 and 0.50 are detected, with two major components at 0.35 and 0.45, as shown in Figure 2b, indicating a fairly homogeneous inner composition. In addition to the deconvolution, we sought a coarse indication of the vertical profile of *x* by approximating a SiGe island as a multilayer stack with high Ge content on top (thereby neglecting any lateral variation of composition). The composition profile was obtained through the minimization of Equation 2, with the total thickness given by the AFM profile as the only constraint. The results in Figure 2b show values of *x* that are compatible with those expected for islands grown at similar temperatures[18]. By comparison with the results in the literature[18], we observe that, despite the roughness of the approximation, the vertical distribution of *x* within the islands agrees with the trend observed with other techniques. Furthermore, Figure 2c shows that in this experiment strain cannot introduce an artifact larger than 0.10 in the determination of the composition. We underline that the profile was obtained with a nondestructive measurement from one single spectrum.