Chiral Space Groups
When the original Naval Research Laboratory Crystal Lattice Structures web page went online, one of the first questions we received was “Are there crystal structures in every space group?”
This was not an unreasonable question. In the fall of 2001 the site only contained 124 structures occupying 51 of the 230 space groups, an unsurprising result for a fairly new web site only updated on an ad hoc basis.
Of course the answer to the question is “yes, there are structures in every space group,” as demonstrated by Frank Hoffmann's wonderful space group list projectI,II.
A better question is how crystal structures are distributed between the space groups. Hoffmann notes that some space groups are very sparsely populated. In particular (CH)17FeO4Pt is the only compound to have ever been found in space group $P422$ #89.III,IV The Inorganic Crystal Structures Database (ICSD) has no entries for space group $P4_{2}22$ #93, although Hoffmann did find one in the literature, and the mostly organic Cambridge Structural Database (CSD) has nine entries, a mere 0.0007% of the total. At the other extreme, the 2024 CSD rankings list 440,837 entries (34%) for space group $P2_{1}/c$ #14, and the most populated group in the ICSD is $Pnma$ #62, taking up 7.6% of the database.
What we'd like to do, then, is to tabulate the number of structures in each space group and then ask the questions
As it turns out, the answer to both of the questions in the last paragraph is the same:
Before doing any enumeration of crystal structures in space groups, we should first ask exactly what it is that we wish to count. Do we consider every entry in the CSD or the ICSD as an individual structure? If we do that, then our calculations will be heavily weighted toward organic structures, since the CSD has about four times as many entries as the ICSD. Even if we separate the CSD and ICSD analysis we still have to do some thinking: the ICSD has nearly four thousand entries for compounds in the rock salt (halite) structure. Do we count all of those as one data point, or four thousand?
In this article we're going to have it both ways, or actually in multiple different ways. We'll look at the distribution of structures using the raw CSD and ICSD data, but we'll also try to lump the data into prototypes, so that the four thousand or so rock salt structures only count as one entry. When we look at the raw data, we'll find that some groups are highly favored, e.g., $P2_{1}/c$ takes up 34% of the entries in the CSD, $Pnma$ accounts for 7.6% of the ICSD, $P422$ and $P4_{2}22$ can be mostly ignored, etc. Does this favoritism persist when we go to prototypes? We'll see.
Another interesting topic is the question of what kind of space groups are favored. In our study of chiral structures we found that only 65 groups supported chiral structures, and that those were divided into two classes: Sohncke Class II groups, which are themselves chiral, and Sohncke Class III, which are achiral but have no mirror operations and so only support chiral structures. Our initial study found that only 1% of all structures, organic and inorganic, are in Sohncke II groups, 15-20% of all organic structures (depending on the source of the data) are in a Sohncke III group, while only 4% of all inorganic structures are in any of those groups.
Given that, let's discuss some categories of interest. We'll start with the types of chiral and achiral space groups discussed in the Chiral Space Groups article. There we divided the space groups into three classes, with Class I being all space groups that didn't support chiral structures and Classes II and III comprising the Sohncke space groups. In this discussion we'll further divide the Class I space groups into two categories, giving us four types. Each of the 230 3-dimensional space groups falls into one of these types:
There are three more properties of space groups that are of interest:
All of the above information is collected on the space group information page.
The study of distributions of structures in space groups goes back to at least 1942, and early work in this field is summarized by (Urusov and Nadezhina, 2009). As far as we can tell (the early works being in Russian and not readily available), this consisted of simply counting the number of reports of structures in a given space group and compiling the results. This can certainly lead to an experimental bias, which we will talk about later. For now, though, let's just look at the raw numbers. For this we'll primarily use two sources:
With all of that out of the way, let's look at the distribution of the structures in the CSD and the ICSD, simply counting the number of structure reports that occur in each space group. Start with the organic structures from the CSD: Fig. 1 shows the distribution of entries taken from the 2024 Cambridge Structural Database (CSD). Each bar indicates the number of structures found in a given space group, and the colors indicate the centrosymmetry, chirality, or lack thereof of the group. The number of structures is plotted on a logarithmic scale. Over 75% of all the structures in the CSD are either triclinic or monoclinic. This may not be particularly surprising, as most organic systems are formed of molecules which do not have structural forms that easily stack into parallelepipeds. Perhaps more surprising is that the first and fourth most populated space groups are centrosymmetric $P2_{1}/c$ #14 and non-centrosymmetric $P2_{1}2_{1}2_{1}$ #19, both of which have screw axes. We will look at this phenomenon more closely near the end of this article.
The CSD is mostly restricted to organic systems. In materials physics we're usually concerned with inorganic materials, so the CSD might not be the best source of data for our purposes. Instead we have the ICSD, where the distribution is shown in Fig. 2. Here we see that the inorganic systems are more heavily weighted toward high-symmetry structures. In fact, the highest occupation occurs in the orthorhombic group $Pnma$ #62, although monoclinic $P2_{1}/c$ #14 comes in second. As with the organics, screw axes are favored. Although it is not apparent from the International notation, the full Hermann-Mauguin symbol for $Pnma$ is $P\,2_{1}/n\,2_{1}/m\,2_{1}/a$, showing three screw axes. In addition to that group and $P2_{1}/c$ (2nd place), the cubic space group $Fd\overline{3}m$ (full symbol $F\,4_{1}/d\,\overline{3}\,2/m$) #227 (which includes diamond) comes in third. In fact, of the top four entries, only $Fm\overline{3}m$ #225 does not have an explicit screw operation, but the Hypertext Book shows that it has multiple $2_{1}$ axes, the four $3_{1}$ and $3_{2}$ axes found in every cubic system, and three $4_{2}$ axes.
The aforementioned gap at $P4_{2}22$ #93 is easily seen. There also appears to be a gap at $P6_{4}$ #172, but this is an illusion of the log scale as it contains one structure. This is somewhat unfair, as its enantiomorphic twin, $P6_{2}$ #171, contains all of five structures, each of which could just as easily have been measured in $P6_{4}$. If we consider that, then enantiomorphic pairs such as $P4_{3}32$ #212 and $P4_{1}32$ #213 should be higher up in the list but they will still be relatively unpopulated compared to the big hitters.
Let's look at this data in a little more detail. Table 1 shows the distribution of structures by crystal system for the CSD and the ICSD.
Class | Space Groups (Number) | Space Groups (%) | CSD (Number) | CSD (%) | ICSD (Number) | ICSD (%) |
---|---|---|---|---|---|---|
Triclinic | 2 | 0.87 | 338,868 | 26.17 | 8,720 | 4.03 |
Monoclinic | 13 | 5.65 | 666,959 | 51.51 | 36,204 | 16.73 |
Orthorhombic | 59 | 25.65 | 218,603 | 16.88 | 45,286 | 20.92 |
Tetragonal | 68 | 29.57 | 28,542 | 2.20 | 33,360 | 15.41 |
Trigonal | 25 | 10.87 | 26,124 | 2.02 | 22,258 | 10.28 |
Hexagonal | 27 | 11.74 | 7,197 | 0.56 | 23,968 | 11.07 |
Cubic | 36 | 15.65 | 8,431 | 0.65 | 46,632 | 21.55 |
We see that the organic solids are mostly (78%) triclinic or monoclinic, while 79% of the inorganics have higher symmetry. There are very few tetragonal, trigonal, hexagonal, or cubic organic crystals.
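The shares quoted above follow directly from the counts in Table 1. As a minimal sketch (the dictionary and function names are ours, chosen only for illustration), they can be recomputed as follows:

```python
# Entry counts per crystal system, copied from Table 1.
CSD = {
    "Triclinic": 338_868, "Monoclinic": 666_959, "Orthorhombic": 218_603,
    "Tetragonal": 28_542, "Trigonal": 26_124, "Hexagonal": 7_197,
    "Cubic": 8_431,
}
ICSD = {
    "Triclinic": 8_720, "Monoclinic": 36_204, "Orthorhombic": 45_286,
    "Tetragonal": 33_360, "Trigonal": 22_258, "Hexagonal": 23_968,
    "Cubic": 46_632,
}


def share(counts, systems):
    """Percentage of all entries that fall in the given crystal systems."""
    total = sum(counts.values())
    return 100.0 * sum(counts[s] for s in systems) / total


low_symmetry = ("Triclinic", "Monoclinic")
csd_low = share(CSD, low_symmetry)             # organic, low symmetry
icsd_high = 100.0 - share(ICSD, low_symmetry)  # inorganic, orthorhombic or higher

print(f"CSD triclinic or monoclinic: {csd_low:.1f}%")
print(f"ICSD orthorhombic or higher: {icsd_high:.1f}%")
```

Rounded to whole percentages, these reproduce the 78% and 79% figures in the text.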
What about chirality, centrosymmetry, or the lack of either behavior? That's shown in Table 2. Somewhat surprisingly, about 80% of all reported structures, both organic and inorganic, fit into centrosymmetric space groups, even though those comprise only 40% of all groups. The organic systems tend to be more chiral, which is not surprising given the handedness of biological amino acids and sugars, but even there over 80% of all structures cannot be distinguished from their mirror images.
Class | Space Groups (Number) | Space Groups (%) | CSD (Number) | CSD (%) | ICSD (Number) | ICSD (%) |
---|---|---|---|---|---|---|
Centrosymmetric | 92 | 40.00 | 1,014,301 | 78.34 | 177,507 | 82.02 |
Achiral | 73 | 31.74 | 69,265 | 5.35 | 27,692 | 12.80 |
Sohncke Class II | 22 | 9.57 | 13,851 | 1.07 | 2,410 | 1.11 |
Sohncke Class III | 43 | 18.70 | 197,307 | 15.24 | 8,819 | 4.07 |
Now let's look at the distribution of structures which are either symmorphic or polar. These results are shown in Table 3.
Property | Space Groups (Number) | Space Groups (%) | CSD (Number) | CSD (%) | ICSD (Number) | ICSD (%) |
---|---|---|---|---|---|---|
Symmorphic | 73 | 31.74 | 384,341 | 29.69 | 84,709 | 39.14 |
Polar | 68 | 29.57 | 163,111 | 12.60 | 20,245 | 9.35 |
Symmorphic & Polar | 21 | 9.13 | 28,121 | 2.17 | 5,961 | 2.75 |
Although symmorphic and polar space groups both comprise about 30% of the total number of groups, polar space groups are very underrepresented, with only about 10% of all structures being in polar groups. It's even worse to be polar & symmorphic: while 9% of all space groups fall into this category, less than 3% of all structures are in these groups. This is comparable to the population of Sohncke Class II. Since symmorphic and Sohncke II groups do not overlap, we've found that 43 space groups (18.70%) contain less than 4% of all crystal structures.
Finally, let's look at the heavy-hitters and the candidates for waivers: the space groups with the largest and smallest populations. Start with the popular groups:
CSD Group | CSD Population | ICSD Group | ICSD Population |
---|---|---|---|
$P2_{1}/c$ #14 | 440,837 | $Pnma$ #62 | 16,424 |
$P\overline{1}$ #2 | 325,946 | $P2_{1}/c$ #14 | 14,711 |
$C2/c$ #15 | 106,626 | $Fd\overline{3}m$ #227 | 11,942 |
$P2_{1}2_{1}2_{1}$ #19 | 90,094 | $Fm\overline{3}m$ #225 | 11,915 |
$P2_{1}$ #4 | 67,053 | $I4/mmm$ #139 | 9,053 |
$Pbca$ #61 | 41,436 | $P6_{3}/mmc$ #194 | 8,769 |
$Pna2_{1}$ #33 | 17,606 | $P\overline{1}$ #2 | 8,113 |
$Cc$ #9 | 13,493 | $C2/c$ #15 | 7,542 |
$P1$ #1 | 12,922 | $C2/m$ #12 | 6,849 |
$Pnma$ #62 | 12,905 | $Pm\overline{3}m$ #221 | 6,379 |
On the organic side the distribution is surprisingly narrow: the first two space groups, $P2_{1}/c$ and $P\overline{1}$, account for 59% of all the entries in the CSD. If we add $C2/c$ and $P2_{1}2_{1}2_{1}$ we account for nearly 3/4 of all the entries, and the top ten account for 87%. Inorganic structures are much more spread out, but even here the top ten space groups contain nearly half (47%) of all the entries in the ICSD.
Table 4 also shows that the triclinic space groups $P1$ and $P\overline{1}$ make the top ten of the CSD and $P\overline{1}$ is number seven in the ICSD ($P1$ is #66). This is not terribly surprising, as it is not difficult to scramble a bunch of atoms or molecules. What is more surprising is that the top entry in the CSD, $P2_{1}/c$, has a screw axis, and that group ranks second in the ICSD. Indeed, six of the top ten CSD space groups, including $Pbca$ and $Pnma$, have explicitly named screw operations in their Hermann-Mauguin symbols, while the ICSD has four space groups with screw operations in its top ten, including its top three. We will talk about this more later on, but since it is not exactly clear how to distinguish space groups with screw operations from space groups that have screw axes, we will save a full discussion for another day.
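The cumulative shares quoted for the most populated groups can be checked against Table 4. The sketch below assumes that the database totals are the column sums of Table 1; the variable and function names are ours.

```python
# Top-ten populations copied from Table 4.
CSD_TOP = [440_837, 325_946, 106_626, 90_094, 67_053,
           41_436, 17_606, 13_493, 12_922, 12_905]
ICSD_TOP = [16_424, 14_711, 11_942, 11_915, 9_053,
            8_769, 8_113, 7_542, 6_849, 6_379]

# Assumption: totals taken as the column sums of Table 1.
CSD_TOTAL = 1_294_724
ICSD_TOTAL = 216_428


def cumulative_share(populations, total, n):
    """Percentage of all entries held by the n most populated groups."""
    ranked = sorted(populations, reverse=True)
    return 100.0 * sum(ranked[:n]) / total


csd_top2 = cumulative_share(CSD_TOP, CSD_TOTAL, 2)
csd_top4 = cumulative_share(CSD_TOP, CSD_TOTAL, 4)
csd_top10 = cumulative_share(CSD_TOP, CSD_TOTAL, 10)
icsd_top10 = cumulative_share(ICSD_TOP, ICSD_TOTAL, 10)

for name, value in [("CSD top 2", csd_top2), ("CSD top 4", csd_top4),
                    ("CSD top 10", csd_top10), ("ICSD top 10", icsd_top10)]:
    print(f"{name}: {value:.1f}%")
```

Rounded to whole percentages, these give 59%, 74%, 87%, and 47%, in line with the figures quoted in the text.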
CSD Group | CSD Population | ICSD Group | ICSD Population |
---|---|---|---|
$P4_2mc$ #105 | 3 | $P4_{2}22$ #93 | 0 |
$P6mm$ #183 | 4 | $P432$ #207 | 2* (1) |
$P\overline{4}m2$ #115 | 7 | $P422$ #89 | 3* (0) |
$P4_{2}32$ #208 | 8 | $P6$ #168 | 3 |
$P4mm$ #99 | 9 | $P4_{2}cm$ #101 | 4 |
$P4_{2}cm$ #101 | 9 | $P622$ #177 | 4* (3) |
$P\overline{4}2m$ #111 | 9 | $I432$ #211 | 5* (0) |
$P4_{2}22$ #93 | 10 | $P6_{2}$ #171 / $P6_{4}$ #172 | 6 |
$Pmm2$ #25 | 12 | $Pcc2$ #27 | 7* (5) |
$Cmm2$ #35 | 14 | $F4_{1}32$ #210 | 7 |
Table 5 shows that $P4_{2}22$ has no entries in the ICSD. The one $P422$ entry we have in the Encyclopedia was found using the CCDC search engine. Some of the other counts are suspect: the three entries reported in the ICSD for space group $P422$ are actually in the higher-symmetry tetragonal space group $P4/mmm$, hence $P422$ really has no ICSD entries, and we write its population as 3* (0). We only noticed this because it was first pointed out by Frank Hoffmann as part of his space group list project, but it prompted us to look at all of the bottom ten groups, where we found another nine structures with higher symmetries than shown in the ICSD, and another space group that has no ICSD entries.
While the $P6_{2}$/$P6_{4}$ enantiomorphic pair has only 6 entries in the ICSD, only two other space groups in the CSD ($P4mm$ and $P6mm$) and one in the ICSD ($P6$) are in the dread “enantiomorphic or symmorphic&polar” category. We do find that four of the CSD entries and two of the ICSD entries have explicitly named $4_{2}$ screw operations. This leads us to prepare one more table:
Property | Space Groups (Number) | Space Groups (%) | CSD (Number) | CSD (%) | ICSD (Number) | ICSD (%) |
---|---|---|---|---|---|---|
Named $4_{2}$ Screw | 18 | 7.83 | 3,422 | 0.26 | 3,561 | 1.65 |
Sohncke II or Symmorphic & Polar or $4_{2}$ Screw | 61 | 26.52 | 45,394 | 3.51 | 11,932 | 5.51 |
This table shows that the space groups which:
While the distribution plots in Fig. 1 and Fig. 2 are informative, they count by experiment rather than by structure, and so emphasize structures and compounds which are popular, useful, or have many possible chemical compositions. We would like to find a procedure which eliminates or at least minimizes this bias.
The problem can be seen by looking at space group $Fm\overline{3}m$ #225. The ICSD has 11,915 experimental entries in this group. More than one tenth of these entries (1,393) are reports of monatomic samples in the face-centered cubic (A1) structure. This includes multiple determinations of the lattice constant for 49 elements as well as many measurements on alloys. Another 3,894 entries are for compounds with the NaCl (halite) structure. In other words, 44% of the entries for $Fm\overline{3}m$ are taken up by two structures, significantly skewing the distribution. Similar behavior takes place in other space groups. For example, $Ia\overline{3}d$ #230 has 1,384 entries, but 571 of these are for the $S1_{4}$ form of garnet and 434 are for the Y$_{3}$Al$_{5}$O$_{12}$ form. Obviously the data is biased toward structures that are frequently found in nature and that can be formed by many different combinations of elements.
What we'd like is to count unique structures, not thousands of samples of rock salt. Crystallographers have actually addressed this, introducing the concept of structure types (Lima-de-Faria, 1990). Briefly, two compounds belong to the same structure type if they are:
While this categorization into structures is a useful definition, it causes some difficulties for the Encyclopedia. Many structure types include structures with different numbers of elements, a concept foreign to both AFLOW and the Encyclopedia. As an example, the ICSD lists Ho$_{11}$Ge$_{10}$ (AFLOW Label A10B11_tI84_139_dehim_eh2n-001), Tb$_{11}$Si$_{4}$In$_{6}$ (A6B4C11_tI84_139_hm_dei_eh2n-001), and Sc$_{11}$Al$_{2}$Ge$_{8}$ (A2B8C11_tI84_139_h_deim_eh2n-001) under the Ho$_{11}$Ge$_{10}$ structure type. Electronic structure inputs (e.g., VASP POSCAR files) for each of these structures would look very different. One of the capabilities of the Encyclopedia is the generation of such inputs, so it was decided to break this structure type into three different prototypes. This follows the structure of the AFLOW prototype label (Mehl, 2017; Eckert, 2024) which distinguishes between monatomic, binary, ternary, …, compounds, so that each of the above structures has its own label, as noted.
Ideally we could use AFLOW-XtalFinder (Hicks, 2021) to compare each pair of structures in the ICSD having the same space group, placing all structures sufficiently close to one another in a unique prototype. In practice this would require millions, if not billions, of calculations, a task well beyond the scope of this brief report. We can, however, determine the AFLOW prototype label for nearly every structure in the ICSD.VI We can then define a unique structural designation by combining the ICSD structure type and the AFLOW prototype label. We then assume that ICSD entries having the same structure type and AFLOW prototype label are in fact in the same prototype. Some ICSD entries do not list a structure type, so we categorize them solely by their AFLOW prototype labels.
This procedure results in a substantial reduction in the number of entries. To use our previous examples, the fcc and NaCl structures are reduced from 1,393 and 3,894, respectively, to one prototype each. Space group $Ia\overline{3}d$ has 1,384 entries in the ICSD but only 22 prototypes.
One caveat to this scheme is that the AFLOW prototype label depends on the alphabetical order of the elements in the compound. Thus both Cu$_{3}$Au and CuAu$_{3}$ are in the $L1_{2}$ structure, but the former has the label AB3_cP4_221_a_c-001 and the latter the label A3B_cP4_221_c_a-001. This means that we will have some problems with duplication. Still, we have something much closer to a real index of prototypes with minimal effort. As an example, the ICSD has 1,267 entries in what it calls the “Auricupride#AuCu3” structure. This should reduce to one prototype, but in the quick and dirty scheme described here it has two entries.
Unfortunately we do not have a list of structure types for the entries in the CSD, so we must restrict our study to the ICSD. The results are shown in Fig. 3. It is not that different from the full ICSD plot in Fig. 2. The major change is that the monoclinic space groups $P2_{1}/c$ and $C2/c$ and triclinic $P\overline{1}$ have more prototypes than $Pnma$. This is probably because the lower symmetry structures have more possible prototypes for a given set of Wyckoff positions.
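The grouping rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout (a structure type paired with an AFLOW label) and the function names are hypothetical, the labels are quoted from the text, and the number of records per label is made up.

```python
from collections import defaultdict


def prototype_key(structure_type, aflow_label):
    """Combine the ICSD structure type with the AFLOW prototype label.

    Entries that list no structure type are keyed by the label alone.
    """
    return (structure_type or None, aflow_label)


def count_prototypes(entries):
    """Map each distinct prototype key to the number of entries it absorbs."""
    groups = defaultdict(int)
    for structure_type, aflow_label in entries:
        groups[prototype_key(structure_type, aflow_label)] += 1
    return dict(groups)


# Hypothetical records: (structure type, AFLOW label). The multiplicities
# are invented for illustration and are not ICSD counts.
entries = [
    ("Auricupride#AuCu3", "AB3_cP4_221_a_c-001"),   # e.g. Cu3Au
    ("Auricupride#AuCu3", "AB3_cP4_221_a_c-001"),
    ("Auricupride#AuCu3", "A3B_cP4_221_c_a-001"),   # e.g. CuAu3
    (None, "A10B11_tI84_139_dehim_eh2n-001"),       # no structure type listed
]

prototypes = count_prototypes(entries)
print(len(prototypes), "prototypes from", len(entries), "entries")
```

In this toy input the single “Auricupride#AuCu3” structure type ends up as two keys, which is the duplication caused by the alphabetical ordering of the label.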
What about the prototypes that are actually listed in the Encyclopedia? Their distribution is shown in Fig. 4. At the time of its compilation (January, 2025) there were 2,014 entries, many of them chosen explicitly to populate small occupation space groups, because they had Strukturbericht labels, or because we were interested in the structures for our own research. This compilation bias means that the graph is substantially different from the previous ones. No effort was made to reproduce the distribution found in the ICSD. As the Encyclopedia expands the distribution will approach that of the ICSD, but that point is far in the future.
Table 7 shows the distribution of prototypes and Encyclopedia entries in each crystal system. Even though space group $Pnma$ is not the most highly populated space group we see that overall there is a substantial shift from higher symmetries into orthorhombic crystals.
Class | Space Groups (Number) | Space Groups (%) | Prototype (Number) | Prototype (%) | Encyclopedia (Number) | Encyclopedia (%) |
---|---|---|---|---|---|---|
Triclinic | 2 | 0.87 | 6,457 | 9.76 | 28 | 1.39 |
Monoclinic | 13 | 5.65 | 19,341 | 29.25 | 329 | 16.34 |
Orthorhombic | 59 | 25.65 | 15,504 | 23.45 | 546 | 27.11 |
Tetragonal | 68 | 29.57 | 7,442 | 11.25 | 370 | 18.37 |
Trigonal | 25 | 10.87 | 6,451 | 9.76 | 243 | 12.07 |
Hexagonal | 27 | 11.74 | 5,126 | 7.75 | 234 | 11.62 |
Cubic | 36 | 15.65 | 5,803 | 8.78 | 264 | 13.11 |
Table 8 shows the distribution of prototypes and Encyclopedia entries by chirality and centrosymmetry. The prototype distribution is not much changed from the raw data. The Encyclopedia distribution is skewed toward Sohncke Class II, which has proportionally about three times as many entries as the raw data (organic or inorganic) and the prototype data. This is undoubtedly because we emphasized finding structures in each space group, and the Class II groups are difficult to fill experimentally.
Class | Space Groups (Number) | Space Groups (%) | Prototype (Number) | Prototype (%) | Encyclopedia (Number) | Encyclopedia (%) |
---|---|---|---|---|---|---|
Centrosymmetric | 92 | 40.00 | 50,903 | 76.98 | 1,385 | 68.77 |
Achiral | 73 | 31.74 | 9,567 | 14.47 | 388 | 19.27 |
Sohncke Class II | 22 | 9.57 | 897 | 1.36 | 66 | 3.28 |
Sohncke Class III | 43 | 18.70 | 4,760 | 7.20 | 175 | 8.69 |
Table 9 shows the distribution of structures for polar, symmorphic, and named $4_{2}$ screw operation space groups, combining Table 3 and Table 6. Both the prototype and Encyclopedia entries show a larger fraction of polar systems than the raw ICSD data, but the fraction of systems with screw axes is very similar to what we find in the raw data, with just over 30% of all systems having screw axes.
Property | Space Groups (Number) | Space Groups (%) | Prototype (Number) | Prototype (%) | Encyclopedia (Number) | Encyclopedia (%) |
---|---|---|---|---|---|---|
Symmorphic | 73 | 31.74 | 24,321 | 36.78 | 708 | 35.15 |
Polar | 68 | 29.57 | 9,275 | 14.03 | 316 | 15.69 |
Symmorphic & Polar | 21 | 9.13 | 2,918 | 4.41 | 100 | 4.97 |
Named $4_{2}$ Screw | 18 | 7.83 | 847 | 1.28 | 63 | 3.13 |
Sohncke II or Symmorphic & Polar or $4_{2}$ Screw | 61 | 26.52 | 4,662 | 7.05 | 228 | 11.32 |
Finally, let's look at the space groups with the largest and smallest populations as a function of our prototypes and the Encyclopedia entries. The Prototype list is not much changed from the full ICSD list. Two of the new additions, $P2_{1}/m$ and $Cmcm$, have a screw axis, so now five of the top ten have screw axes.
Prototype Group | Prototype Population | Encyclopedia Group | Encyclopedia Population |
---|---|---|---|
$P2_{1}/c$ #14 | 7,825 | $Pnma$ #62 | 105 |
$P\overline{1}$ #2 | 5,993 | $P2_{1}/c$ #14 | 93 |
$C2/c$ #15 | 3,833 | $C2/m$ #12 | 76 |
$Pnma$ #62 | 3,366 | $P6_{3}/mmc$ #194 | 74 |
$C2/m$ #12 | 3,278 | $C2/c$ #15 | 70 |
$R\overline{3}m$ #166 | 1,461 | $R\overline{3}m$ #166 | 69 |
$P6_{3}/mmc$ #194 | 1,457 | $Cmcm$ #63 | 56 |
$P2_{1}/m$ #11 | 1,315 | $I4/mmm$ #139 | 50 |
$Cmcm$ #63 | 1,254 | $P4/mmm$ #123 | 29 |
$I4/mmm$ #139 | 1,162 | $P\overline{3}m1$ #164 | 29 |
Prototype Group | Prototype Population | Encyclopedia Group | Encyclopedia Population |
---|---|---|---|
$P4_{2}22$ #93 | 0 | $I432$ #211 | 1 |
$P432$ #207 | 2* (1) | $P31m$ #157 | 1 |
$P422$ #89 | 3* (0) | $P4_{2}/nnm$ #134 | 1 |
$P4cc$ #103 | 3 | $P\overline{4}b2$ #117 | 1 |
$P6$ #168 | 3 | $P\overline{4}2c$ #112 | 1 |
$I432$ #211 | 5* (0) | $I4_{1}cd$ #110 | 1 |
$P4_{2}cm$ #101 | 4 | $P4_{2}bc$ #106 | 1 |
$P622$ #177 | 4 | $P4_{2}mc$ #105 | 1 |
$P6_{2}$ #171/$P6_{4}$ #172 | 5 | $P\overline{4}$ #81 | 1 |
$Pcc2$ #27 | 7* (5) | $I4_{1}$ #80 | 1 |
The prototype list has some of the usual suspects: two space groups with a $4_{2}$ screw operation and one enantiomorphic pair. The Encyclopedia entries are a little different: currently the Encyclopedia has 22 space groups with only one member. For Table 11 we arbitrarily list the ten with the largest space group number. The enantiomorphic pairs do not show up in the Encyclopedia list as we made an explicit effort to find at least one structure in every space group, so the total count for a pair is at least two. Only two pairs fall into this category: $P4_{1}22$/$P4_{3}22$ and $P6_{2}$/$P6_{4}$, with the other pairs having at least four entries between them.
As noted in Table 4, over one-third of the structures in the CSD are in space group $P2_{1}/c$, and another 14% are in space groups $P2_{1}2_{1}2_{1}$, $P2_{1}$, $Pna2_{1}$, or $Pnma$, all of which have screw operations mentioned in their Hermann-Mauguin symbols. The ICSD is not as biased, but the top three entries, $Pnma$, $P2_{1}/c$, and $Fd\overline{3}m$, all have screw operations. We might expect to find screw axes in organic systems, as they are often composed of molecules, and the twisting allowed by a screw axis may make it easier to form a compact structure (Cockcroft, 2016).
In fact, organic systems seem to prefer multiple screw axes, if possible. To see this, consider the first four orthorhombic space groups: $P222$ #16, $P222_{1}$ #17, $P2_{1}2_{1}2$ #18, and $P2_{1}2_{1}2_{1}$ #19. As we can see from the space group notation, these have zero, one, two, and three screw axes respectively. Otherwise they are quite similar: they are all in Sohncke Class III, allowing chiral structures, and they are non-centrosymmetric and non-polar. As we can see in Fig. 5, the number of structures increases more or less exponentially with the number of screw axes.
We see similar behavior in higher symmetry structures. Space group $Pnma$ #62 is the most populated entry in the ICSD and the most populated higher symmetry structure in our prototype list. $Pbca$ #61 has more organic entries than any other space group beyond #19, and it also has three screw axes. Obviously nature favors space groups with screw axes, the more the better.
From our figures, we see that $P2_{1}2_{1}2_{1}$ is the most heavily populated non-centrosymmetric space group in all distributions, including 32% of all the non-centrosymmetric entries in the CSD. It appears that if you must be a non-centrosymmetric crystal with a screw axis, you want to have as many screw axes as possible.
Finally, let's have a little fun. Benford's Law states that if we take our data, count the number of entries in a given space group, and then ask how many of those space groups have a count beginning with the digit $d$, those counts will be distributed along a logarithmic scale, with the probability of an entry beginning with the digit $d$ proportional to
$$\log_{10}\left(1 + \frac{1}{d}\right). \qquad (1)$$
In Fig. 6 we plot the number of entries in each space group that begin with the digit 1, 2, 3, etc. That is, if we look at the data back in Fig. 1, 78 of the space groups have a population beginning with 1, from 101,754 in $C2/c$ down to 14 in $Cmm2$. There are 34 groups with a count starting with 2, 25 with 3, and so on. The solid line shows the ideal distribution (1), normalized to the first CSD entry.
Benford's Law works best when the data is distributed across many orders of magnitude, which is certainly true for the raw CSD and ICSD data, and is more or less uniformly distributed across its range. The reader will have to judge if this last statement is true from a study of Figures 1-4.
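For readers who want to try this on their own counts, here is a minimal sketch of the first-digit tally. It uses the standard form of Benford's Law, $P(d) = \log_{10}(1 + 1/d)$, and scales the ideal curve to the observed count for the digit 1. The function names are ours, and the sample is just the ten CSD populations from Table 4, far too few to test the law and used only to show the mechanics.

```python
import math
from collections import Counter


def leading_digit(n):
    """First decimal digit of a positive integer."""
    return int(str(n)[0])


def benford_probability(d):
    """Benford's Law probability that a value begins with digit d (1-9)."""
    return math.log10(1.0 + 1.0 / d)


def first_digit_counts(populations):
    """Number of non-zero populations beginning with each digit 1 through 9."""
    counts = Counter(leading_digit(n) for n in populations if n > 0)
    return [counts.get(d, 0) for d in range(1, 10)]


# Illustrative sample only: the ten CSD populations listed in Table 4.
sample = [440_837, 325_946, 106_626, 90_094, 67_053,
          41_436, 17_606, 13_493, 12_922, 12_905]

observed = first_digit_counts(sample)

# Scale the ideal distribution so that it matches the observed digit-1 count.
scale = observed[0] / benford_probability(1)
expected = [scale * benford_probability(d) for d in range(1, 10)]

for d, (obs, exp) in enumerate(zip(observed, expected), start=1):
    print(f"digit {d}: observed {obs}, Benford {exp:.2f}")
```

Empty space groups are skipped because a count of zero has no leading digit in this sense. Passing the full list of 230 populations instead of `sample` gives the kind of tally plotted in Fig. 6.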
So what have we found? As we noted at the beginning, the details of structural distribution will depend on what structures are being studied. However, we can make some general statements:
We would like to thank Prof. Harold Stokes for enumerating all of the screw axes in the space group tables, not just the ones explicitly named in the Hermann-Mauguin symbols for the space groups.
I If you have a hallway in your department that needs pictures, check out the poster version, which shows a structure from each of the 230 space groups.
II We have used a number of structures from this list for the Encyclopedia.
III For historical reasons the Encyclopedia has two entries for this structure, with and without the hydrogen atoms.
IV The current edition of the ICSD has three entries that are claimed to be in space group $P422$ #89, but all three are actually tetragonal. AFLOW places them in $P4/mmm$ #123. Hoffmann reported that the CSD has 9 entries, but 6 have no coordinates and 2 are incorrectly assigned, leaving only (CH)17FeO4Pt, technically known as tetrakis(μ2-acetato)-tetrakis(μ2-ferrocenecarboxylato)-tetra-platinum toluene solvate. (The current CSD lists 18 entries, but we have not been able to analyze the new entries to see if they are indeed in $P422$.)
V At the moment we are only counting the 18 space groups with $4_{2}$ operations in the Hermann-Mauguin symbol. This omits 12 space groups which have $4_{2}$ screw axes that are not so labeled. Ten of those groups are also sparsely populated and so fit into this category. The other two, $I4/mmm$ #139 and $Fm\overline{3}m$ #225, contain over 20,000 ICSD entries between them. Rather than hand-wave as to why these two groups should be neglected we will simply leave all 12 out of the calculation. As partial justification, we note that none of these twelve groups has a $4_{2}$ operation in its group elements, and four of them, including $I4/mmm$ and $Fm\overline{3}m$, are symmorphic.
VI There are several caveats to this statement:
This is a list of resources mentioned in the text: