Open Access

Assortative preferences in choice of major

Sep 16, 2020

Introduction

Evidence shows that expectations about earnings, employment opportunities, marriage options, job–family balance, enjoyment of course work, the social status of available jobs, and one's own ability to successfully complete the studies associated with each major are fundamental factors in the choice of field of study. The evidence also shows that there is substantial error in beliefs (subjective expectations) about the population values of these determinants (Stinebrickner and Stinebrickner, 2014). When students are provided with correct information, they update their beliefs and their choice of field of study (Wiswall and Zafar, 2015; Arcidiacono et al., 2012).

According to the 2013 National Graduate Survey, close to 60% of university graduates in Canada report that their parents’ recommendations played a very important role in their choice of major. This is not surprising, because the information available from different sources, and its value, has become more dispersed and questionable. Altonji et al. (2015), for instance, documented that Princeton pushes students to consider departments with fewer students. Some postsecondary institutions prefer a distribution of students across majors that correlates with the distribution of faculty members in those majors. Departments in high (low) demand make their own field of study less (more) attractive when counseling students in their choice of major. As complex education choices are made under uncertainty about the achievement of choice-specific outcomes and about personal preferences and abilities, parents become the least costly and most trustworthy channel of information, especially in Canada, where switching majors is not costless.

As expected, studies (e.g., Hoxby and Avery, 2013) show that less well-educated parents with no specialization would not be good transmitters of information.

Although the system is different from one where the major is chosen at entry into the university through a centralized test or using a threshold grade point average (GPA) required for each major, students in Canada are usually accepted to universities at three main faculty levels: Arts, Science, and Business/Commerce. Each of these requires different courses to be completed in high school with competitive GPAs at grades 11 and 12. Therefore, although the majors are decided after the second year, roughly after completing 14–16 core courses within each faculty, switching majors across faculties imposes a significant cost on students and parents.

Yet, a significant assortativity (a child predictably becomes a teacher because it is his father’s and/or mother’s job) could also suggest systemic biases in decision-making, especially when the information about the achievement of the future major-specific outcomes is bounded by parents’ fields of study.

In addition to information asymmetry, parents could also impose their preferences on their child’s educational attainment by their willingness to use financial transfers to “distort” their child’s choice toward (or against) a specific field of study. Zafar (2012) investigates this issue in his recent paper titled, “Double majors: one for me, one for the parents?”

What then is the parents’ role in the belief formation? To understand that information is not distributed symmetrically across majors with the same value and volume, imagine that both parents are accountants and working in the finance industry. The cost of obtaining the same level of information about other majors, say on biochemistry, is obvious. This brings us to the question of how the field-of-study homogamy (FSH) and whether the parents work in related occupations affect the magnitude of information asymmetry. The following 2 empirical questions need to be answered to assess the role of information asymmetry in the choice of major more formally: How can we quantify the resemblance of fields of study between parents and children beyond a binary proposition that reflects the assortative tendency, an association that exposes the attraction of each child to their own parents’ majors? How can we identify the role of information asymmetry in this assortative tendency, after removing the other factors that are not observed by the researcher, such as implicit randomness, ability sorting, and individual tastes, embedded in the resemblance of majors between each parent and child? We will try to answer both questions in this paper.

This study’s primary objective is to investigate the role of information asymmetry in children’s attraction to their parents’ field of study reflected by assortative tendencies in child–parent matches. We apply conventional intergenerational transmission functions that relate the children’s assortative preferences to FSH and whether parents work in their trained jobs within Canadian families. We use the confidential major file of the 2011 National Household Survey (NHS) so that the size of the data and the availability of different levels of aggregation in the Classification of Instructional Programs (CIP) allow us to develop 3 indicators: the degree of children’s attraction to their parents’ field of study (field of study attraction or FSA), the degree of FSH, and the degree of relatedness between each parent’s field of study and occupation (field-of-study relatedness or FOR). To identify the role of information asymmetry in assortative patterns, we define quasi-likelihood transmission functions, where the response variables take on fractional values of FSA between each child (son/daughter) and parent (father/mother) as a function of FSH and FOR. Similar to the difficulties in identifying the role of expected earnings in college major choice, the challenge here is also to control for selection into each major. To tackle this problem, we define within-family transmission functions based on an assortative matching model with 1-to-2 matches (1 for each parent), inspired by Diamond and Agarwal (2016). Comparable to panel models, this allows us to reduce unobserved heterogeneity so that the results provide new and more direct evidence about the intergenerational association of field of study due to information asymmetry reflected in assortative tendencies, which is, to the best of our knowledge, the first of its kind in the literature.

The first part of our results shows that children’s choice of field of study exhibits significant assortative preferences. This finding is a new contribution that reports intergenerational skill transfers as opposed to educational mobility. We also find that the assortative tendency is the highest between fathers and sons relative to all other pairs, namely, father–daughter, mother–son, and mother–daughter. This evidence becomes even stronger when we use more disaggregated CIP codes and control for educational degrees. A significant skill sorting in mating is also revealed by the FSH measures, which also indicate gender differences in the attractiveness of each major in marriage. This finding is consistent with the evidence that the gain from the marriage could be different for each spouse (Choo and Siow, 2006) and with the evidence of a substantial degree of gender heterogeneity in the preferences for each major (Wiswall and Zafar, 2015). These findings, on significant intergenerational skill transfers and greater assortative mating for skills, are in line with the concerns about the possible progressive skill stratifications and earning inequalities in societies.

In the second part, the estimation results show that higher assortativity in each child–parent combination is strongly associated with greater homogamy and field-of-study relatedness in parents’ jobs. The empirical approach that we apply here aims to identify the role of information boundaries in this relationship. Our findings indicate that asymmetric information is a significant contributor to children’s assortative tendencies in their choice of major.

The remainder of the paper is organized as follows: Section 2 introduces the data, homogamy, assortative preferences, and occupational relatedness; Section 3 introduces the conceptual background that links the subjective expectations in choice of major to information entropy; the empirical framework is explained in Section 4; the estimation results are reported in Section 5; and we provide the concluding remarks in Section 6.

Data, assortative preferences, homogamy, and occupation match
Data

This study uses the confidential major file of the 2011 NHS. We restricted the data to include only non-Aboriginal native-born individuals living in 10 provinces. We also dropped non-degree-holder parents (i.e., those with no education or an education degree that does not grant a major) and those whose field of study contains <10 workers. After these restrictions, we obtained about 2.3 million observations. The 2011 NHS enables the classification of individuals’ major field of study in which the highest postsecondary certificate, diploma, or degree was granted. Statistics Canada classifies the major fields of study by using the CIP, which includes 1,688 instructional program classes.

For more information on CIP classification, see www.statcan.gc.ca/concepts/classification-eng.htm. The most aggregated level classifies CIP codes into 12 major groups. This aggregation is then refined into 41 and 372 groups, down to the most detailed level, where all majors are presented with 1,688 CIP codes.

One major challenge in identifying children’s choice of field of study in relation to their parents’ educational background is the availability of data. There is no survey in Canada in which respondents are directly asked about their parents’ field of study. Although parents’ schooling years are more accessible, many studies on educational transmission face the same challenge. For example, in a recent study, Chevalier et al. (2013) use a subsample from a pool of Labour Force Surveys in the UK, which include children aged 16–18 years and living at home, so the parental information can be matched to the child’s record. In order to identify field-of-study resemblance between parents and children, we use the same approach and create a subsample that is composed of children living at home. Although this restriction reduces the total sample size, it becomes less severe for the comparable age groups between 16 and 25 years of age. For example, while there are 122,000 females with an identified field of study between the ages of 19 and 21 years in the whole sample, our subsample includes 26,000 children who live with their parents. Moreover, we use this subsample only for FSA calculations, while the indices for parental homogamy and occupational relatedness use the full sample. We are aware that using a subsample of observations raises a question of selectivity. To ensure that the final sample is representative of the population, we first compare the distribution of parents (fathers and mothers, separately) living with their children to that of the whole sample across CIP codes classified into 12 and 41 groups based on 5-year age classes. We apply the same comparison for children based on gender and age. The results seem to confirm that the distribution of children and parents across fields of study by age and gender in our restricted subsample mirrors the same distributions in the full sample.

Hilger (2016) has developed a new method to adjust the data to recover the outcomes of “missing” independent children. However, their educational outcome is measured in terms of years of schooling. We have also applied the inverse probability weights method to our subsample to address the possible selectivity problem. The results on FSA calculations do not change significantly.

Although we are forced to study only those children who live with their parents, this issue has to be well thought out. Our sample’s distributional representativeness of population by age and gender is just one aspect of the matter. It is possible, for instance, that those individuals who live at home are more likely to attend their closest higher education institution and thus their program choice set would be limited by a regional concentration in a particular industry, in which the parents work. Although we address this problem by controlling for unobserved regional heterogeneity, another issue would be that children who have a good relationship with their parents are more likely to stay at home and may therefore hold their parents as likely role models and follow in their footsteps. If this is the case, using our sample would lead to an upward bias in the parent–child assortativeness of field of study.

We are grateful to one of our referees for bringing this point to our attention. Another point raised by the referee is that the financial crisis before the 2011 Census is likely to have had an impact on choices of field of study. Issues like oversensitivity to uncertainty and a higher level of risk aversion may have strong effects on the choice set of majors for the cohort of students that we investigate in our study.

More descriptive information about the data and our samples is provided in the following sections, which explain FSH, FSA, and FOR.

FSH measures

Assortative mating has long been documented by demographers using nonparametric log–linear models based on contingency tables of ethnicity, education, religion, and other attributes (Schwartz, 2013). Following Becker’s (1973, 1974) theory on marriage markets, economists have also investigated assortative mating in relation to match gains and returns to marriage. Chiappori and Salanie (2016), for instance, show that educational homogamy of posterity is likely to be reinforced by increases in the human capital of parents, who are matched homogamously themselves. Bicakova and Jurajda (2016) are the first to analyze mating by field of study for European countries.

Unlike joint or conditional probabilities that define the likelihood of a match, we choose the following identity that recognizes the randomness inherent in the matching process and specifies to what extent the match is driven by assortative mating on the field of study and to what extent it reflects the marginal distributions of each major:

$$\text{FSH}=\Pr(M)\left[\Pr(F \mid M)-\Pr(F)\right],$$

where F and M are indicators of fields of study for the female and male mates, respectively, in matching couples. As recognized in the literature, the observed matches in a marriage market are jointly determined by the preferences of both partners. For example, Choo and Siow (2006) argue that the observed marriage patterns positively depend on the gross gains to marriage in which the individual returns could be different for each spouse, reflecting a spousal “appreciation” or “attraction” of each field of study in mating.

With the number of matches, $m_{ij}$, where i and j reflect the husband’s and wife’s major in each row and column in the resulting contingency table, the FSH matrix can be calculated by Equation (2):

$$\frac{m_{ij}}{T}-\frac{m_{i}m_{j}}{T^{2}},$$

where $m_{i}$ and $m_{j}$ represent the row and the column totals, respectively, and T is the total number of pairs. While the FSH matrix reveals the assortativeness between, e.g., a male accountant and a female historian in mating, it would be quite possible that a male accountant’s attraction to a female historian would be different from her attraction to a male accountant. In order to reflect a differential “appreciation” of each field of study for each spouse in mating, we use a simple horizontal (vertical) normalization for each row (column) of the FSH matrix between 0 and 1. The results are reported in Table 1.

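Table 1. Field-of-study homogamy (FSH) by prime CIP codes (weighted)

Normalized by husband’s major (rows: husband’s major; columns: wife’s major)

| Husband’s major | Code | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Education | 1 | 1.00 | 0.25 | 0.27 | 0.19 | 0.00 | 0.27 | 0.26 | 0.25 | 0.26 | 0.18 | 0.19 |
| Arts | 2 | 0.18 | 1.00 | 0.42 | 0.42 | 0.00 | 0.29 | 0.31 | 0.29 | 0.30 | 0.01 | 0.28 |
| Humanities | 3 | 0.43 | 0.43 | 1.00 | 0.42 | 0.00 | 0.38 | 0.36 | 0.32 | 0.35 | 0.14 | 0.25 |
| Law | 4 | 0.37 | 0.36 | 0.47 | 1.00 | 0.00 | 0.34 | 0.32 | 0.28 | 0.31 | 0.01 | 0.17 |
| Business | 5 | 0.23 | 0.26 | 0.30 | 0.33 | 1.00 | 0.28 | 0.27 | 0.22 | 0.23 | 0.00 | 0.11 |
| Science | 6 | 0.47 | 0.44 | 0.58 | 0.43 | 0.00 | 1.00 | 0.85 | 0.42 | 0.47 | 0.34 | 0.26 |
| Math/Comp. | 7 | 0.25 | 0.51 | 0.51 | 0.61 | 0.39 | 0.50 | 1.00 | 0.43 | 0.42 | 0.00 | 0.30 |
| Engineering | 8 | 0.00 | 0.35 | 0.17 | 0.17 | 1.00 | 0.35 | 0.45 | 0.66 | 0.44 | 0.70 | 0.79 |
| Agriculture | 9 | 0.30 | 0.19 | 0.09 | 0.16 | 0.00 | 0.30 | 0.19 | 0.22 | 1.00 | 0.36 | 0.21 |
| Health | 10 | 0.28 | 0.29 | 0.30 | 0.26 | 0.00 | 0.34 | 0.29 | 0.29 | 0.31 | 1.00 | 0.25 |
| Services | 11 | 0.00 | 0.32 | 0.18 | 0.25 | 0.54 | 0.28 | 0.36 | 0.35 | 0.38 | 0.63 | 1.00 |

Normalized by wife’s major (rows: husband’s major; columns: wife’s major)

| Husband’s major | Code | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Education | 1 | 1.00 | 0.29 | 0.50 | 0.29 | 0.00 | 0.37 | 0.01 | 0.07 | 0.14 | 0.18 | 0.10 |
| Arts | 2 | 0.37 | 1.00 | 0.57 | 0.47 | 0.23 | 0.41 | 0.15 | 0.18 | 0.23 | 0.20 | 0.26 |
| Humanities | 3 | 0.43 | 0.48 | 1.00 | 0.46 | 0.14 | 0.45 | 0.15 | 0.13 | 0.20 | 0.17 | 0.18 |
| Law | 4 | 0.42 | 0.46 | 0.67 | 1.00 | 0.09 | 0.47 | 0.12 | 0.11 | 0.20 | 0.04 | 0.08 |
| Business | 5 | 0.37 | 0.31 | 0.53 | 0.47 | 0.97 | 0.38 | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 |
| Science | 6 | 0.41 | 0.39 | 0.59 | 0.42 | 0.20 | 1.00 | 0.87 | 0.20 | 0.30 | 0.28 | 0.19 |
| Math/Comp. | 7 | 0.37 | 0.44 | 0.55 | 0.48 | 0.32 | 0.48 | 1.00 | 0.20 | 0.23 | 0.17 | 0.22 |
| Engineering | 8 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.13 | 1.00 | 0.16 | 0.68 | 1.00 |
| Agriculture | 9 | 0.41 | 0.35 | 0.47 | 0.40 | 0.27 | 0.45 | 0.07 | 0.18 | 1.00 | 0.34 | 0.26 |
| Health | 10 | 0.38 | 0.31 | 0.49 | 0.36 | 0.05 | 0.46 | 0.00 | 0.12 | 0.17 | 1.00 | 0.17 |
| Services | 11 | 0.30 | 0.33 | 0.42 | 0.37 | 0.39 | 0.30 | 0.11 | 0.17 | 0.23 | 0.42 | 0.68 |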

Notes: The sample used in this table contains all working spouses, regardless of whether they have children with or without an identified CIP code. Because of the very few observations, the table does not report “Others” classified under CIP code 12. The details of majors are as follows: (1) Education, (2) Visual and performing arts, and communication technologies, (3) Humanities, (4) Social and behavioral sciences and law, (5) Business, management and public administration, (6) Physical and life sciences and technologies, (7) Mathematics, computer and information sciences, (8) Architecture, engineering, and related technologies, (9) Agriculture, and natural sources and conservation, (10) Health and related fields, and (11) Personal, protective, and transportation services.

For example, the match between a male accountant (Business) and a female historian (Humanities) is ranked at 0.30 in terms of its assortativity among all possible matches available for a male accountant with holders of other majors. The same match is ranked at 0.53 among those available for a female historian, as reported in the bottom section of the table. These indices simply order each partner’s appeal by his/her field of study and do not impose cardinal restrictions. The diagonal of both sections of the table shows that the evidence supports a strong FSH. Although not reported here, FSH becomes even stronger with the matrix based on 41 CIP codes: comparing the FSH matrices calculated with the 12 and 41 major CIP codes indicates a slightly increasing FSH as we use more detailed CIP codes.
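The calculation behind Table 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the paper: the 3×3 table of couples is made up, the function names are hypothetical, and the "simple normalization" is assumed here to be min–max scaling within each row (column).

```python
def fsh_matrix(counts):
    """Equation (2) for every cell: m_ij / T - (m_i * m_j) / T^2."""
    total = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    return [
        [counts[i][j] / total - row_totals[i] * col_totals[j] / total ** 2
         for j in range(len(col_totals))]
        for i in range(len(row_totals))
    ]


def min_max(values):
    """Rescale a sequence to [0, 1] (assumed form of the normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalize_rows(matrix):
    """Horizontal normalization: rank matches within each husband's major."""
    return [min_max(row) for row in matrix]


def normalize_cols(matrix):
    """Vertical normalization: rank matches within each wife's major."""
    scaled_cols = [min_max(col) for col in zip(*matrix)]
    return [list(row) for row in zip(*scaled_cols)]


# Hypothetical couples: rows are husband's major, columns are wife's major.
toy_counts = [
    [50, 10, 5],
    [8, 40, 12],
    [4, 9, 30],
]
fsh = fsh_matrix(toy_counts)
by_husband = normalize_rows(fsh)
by_wife = normalize_cols(fsh)
```

In this toy table the diagonal cells receive the value 1 under both normalizations, which is the pattern that signals homogamy in Table 1.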

We use the H-index (explained in the next section) to compare the two FSH contingency tables. The index provides the ratio of the actual share of matches with the same field of study (on the diagonal) to the share of matches that one would expect under the random matching assumption.

Quantifying assortative preferences: FSA

The FSA index compares the field of study of each parent to that of each child in a family and calculates the degree of attraction between the 2 based on the probability distributions. We create 4 contingency tables using the restricted subsample explained earlier. Each table reports the number of field-of-study matches between sons and fathers, daughters and fathers, sons and mothers, and daughters and mothers. Similar to Equation (1), to identify the observed matching patterns, we choose the following identity that reflects the differences between observed and expected frequencies under independence:

$$\text{FSA}=\Pr(P)\left[\Pr(K \mid P)-\Pr(K)\right]$$

where P and K are indicators of fields of study for parents and children in matching, respectively. When it is normalized for each parental field of study between 0 and 1, the resulting measures imply the attraction of children to their parents’ majors evaluated by the observed distribution of all possible matches between parents and children. The number of different matching possibilities between the parent and the child comes from the fact that it is the child who faces many different alternatives before making a decision on a major.

The assortativity exposed by FSA reflects only the child’s preferences, as they are defined over children, not over parents in matches. While we use 12, 41, and 137 major groups of CIP in the 4 match tables, we report only the sons’ match calculated with 12 major CIP codes in Table 2. The higher values of FSA on the diagonal indicate that the most likely matches happen between the same fields of study. In each row, for any given major that the parent holds, the normalized FSA indicates the son’s attraction to all other majors relative to the most likely match. The premise of this measure is that the child’s attraction to each parent’s major could be different even if the parents have the same field of study. Intuitively, the same major could be more (or less) attractive for the son, e.g., if it is held by his father, which may reflect not only the differences between maternal and paternal influence but also gender differences in occupational distributions. While dissimilarities in each cell between the upper and lower parts of the table may expose this fact, the presence of a strong assortativity indicates that parents’ field of study is a fundamental factor in children’s choice of field of study.

Table 2. Field-of-study attraction (FSA) of sons by prime CIP codes (weighted)

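By father’s major (rows: father’s major; columns: son’s major)

| Father’s major | Code | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Education | 1 | 0.94 | 0.80 | 1.00 | 0.81 | 0.80 | 0.74 | 0.74 | 0.00 | 0.70 | 0.88 | 0.63 |
| Arts | 2 | 0.32 | 1.00 | 0.37 | 0.35 | 0.31 | 0.36 | 0.32 | 0.00 | 0.21 | 0.35 | 0.38 |
| Humanities | 3 | 0.60 | 0.72 | 1.00 | 0.73 | 0.51 | 0.59 | 0.57 | 0.00 | 0.56 | 0.59 | 0.48 |
| Law | 4 | 0.65 | 0.72 | 0.87 | 1.00 | 0.77 | 0.68 | 0.66 | 0.00 | 0.60 | 0.68 | 0.53 |
| Business | 5 | 0.55 | 0.54 | 0.64 | 0.69 | 1.00 | 0.61 | 0.57 | 0.00 | 0.50 | 0.56 | 0.43 |
| Science | 6 | 0.64 | 0.63 | 0.67 | 0.73 | 0.60 | 1.00 | 0.69 | 0.00 | 0.51 | 0.63 | 0.35 |
| Math/Comp. | 7 | 0.64 | 0.76 | 0.88 | 0.63 | 0.73 | 0.98 | 1.00 | 0.00 | 0.52 | 0.64 | 0.54 |
| Engineering | 8 | 0.17 | 0.14 | 0.00 | 0.02 | 0.00 | 0.08 | 0.18 | 1.00 | 0.20 | 0.14 | 0.21 |
| Agriculture | 9 | 0.23 | 0.10 | 0.01 | 0.11 | 0.02 | 0.20 | 0.16 | 0.00 | 1.00 | 0.26 | 0.51 |
| Health | 10 | 0.69 | 0.66 | 0.76 | 0.75 | 0.63 | 1.00 | 0.57 | 0.00 | 0.57 | 0.84 | 0.59 |
| Services | 11 | 0.25 | 0.21 | 0.08 | 0.06 | 0.00 | 0.12 | 0.25 | 0.51 | 0.24 | 0.23 | 1.00 |

By mother’s major (rows: mother’s major; columns: son’s major)

| Mother’s major | Code | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Education | 1 | 0.98 | 0.84 | 0.94 | 1.00 | 0.86 | 0.84 | 0.77 | 0.00 | 0.77 | 0.85 | 0.46 |
| Arts | 2 | 0.38 | 1.00 | 0.65 | 0.62 | 0.40 | 0.45 | 0.42 | 0.00 | 0.37 | 0.42 | 0.15 |
| Humanities | 3 | 0.62 | 0.71 | 1.00 | 0.78 | 0.70 | 0.74 | 0.67 | 0.00 | 0.57 | 0.58 | 0.46 |
| Law | 4 | 0.52 | 0.59 | 0.69 | 1.00 | 0.75 | 0.53 | 0.47 | 0.00 | 0.55 | 0.54 | 0.47 |
| Business | 5 | 0.21 | 0.14 | 0.06 | 0.00 | 0.37 | 0.17 | 0.29 | 1.00 | 0.25 | 0.16 | 0.38 |
| Science | 6 | 0.72 | 0.64 | 0.70 | 0.66 | 0.75 | 1.00 | 0.70 | 0.00 | 0.60 | 0.64 | 0.57 |
| Math/Comp. | 7 | 0.39 | 0.44 | 0.51 | 0.38 | 0.00 | 0.96 | 1.00 | 0.45 | 0.30 | 0.20 | 0.05 |
| Engineering | 8 | 0.14 | 0.19 | 0.10 | 0.09 | 0.00 | 0.27 | 0.23 | 1.00 | 0.17 | 0.33 | 0.25 |
| Agriculture | 9 | 0.39 | 0.43 | 0.27 | 0.40 | 0.00 | 0.42 | 0.32 | 0.69 | 1.00 | 0.34 | 0.42 |
| Health | 10 | 0.45 | 0.18 | 0.00 | 0.07 | 0.02 | 0.42 | 0.34 | 0.84 | 0.57 | 1.00 | 0.96 |
| Services | 11 | 0.21 | 0.27 | 0.06 | 0.00 | 0.02 | 0.21 | 0.26 | 1.00 | 0.28 | 0.22 | 0.60 |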

Notes: See the notes to Table 1 for the full description of majors.

Although we refrain from using more space to interpret the results here, one may ask to compare the extent of field-of-study attraction between parents and children across 4 match tables. We use an index that computes the ratio of 2 diagonal shares of a match matrix as follows:

$$H=100\left(\frac{\sum m_{ij}/T}{\sum m_{i}m_{j}/T^{2}}-1\right)$$

which is the sum of the joint probabilities on the diagonal relative to the sum of the products of their marginal probabilities. Hence, it provides the ratio of the actual share of matches with the same field of study (on the diagonal) to the share of matches that one would expect under the random matching assumption.

When the children’s choice of major is not affected by their parents’ field of study, each joint probability on the diagonal (numerator) approaches the product of its marginal probabilities; then the whole index becomes zero. Thus, any departure from zero indicates the tendency toward the same field-of-study matches.
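A small sketch may help make the behavior of the index concrete. The function below is a hypothetical helper written for illustration (it is not the code used for the paper), and both contingency tables are made up: the first is constructed so that rows and columns are independent, the second so that matches concentrate on the diagonal.

```python
def h_index(counts):
    """H = 100 * (observed diagonal share / expected diagonal share - 1)."""
    total = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    size = min(len(row_totals), len(col_totals))
    # Share of pairs that actually have the same field of study.
    observed = sum(counts[i][i] for i in range(size)) / total
    # Share expected on the diagonal if majors were matched at random.
    expected = sum(row_totals[i] * col_totals[i] for i in range(size)) / total ** 2
    return 100 * (observed / expected - 1)


# Hypothetical tables (rows: parent's major, columns: child's major).
independent = [[10, 20], [30, 60]]   # each row has the same column shares
assortative = [[40, 10], [10, 40]]   # mass concentrated on the diagonal

h_independent = h_index(independent)
h_assortative = h_index(assortative)
```

With these inputs the independent table yields an index of zero (up to floating-point error), while the assortative table yields an index of about 60, meaning that same-field pairs are about 60% more frequent than random matching would imply.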

The indices calculated for the 41 major CIP codes are as follows: 119.56 for Father–Son, 31.28 for Mother–Son, 60.02 for Father–Daughter, and 48.88 for Mother–Daughter. These sharp differences, tested by 95% bootstrapped confidence intervals, indicate that a randomly picked father–son pair is about twice as likely to share the same field of study as would be predicted under random matching. Moreover, a very low index for mother–son pairs suggests that the overall attraction of sons to their mother’s major is only slightly higher than what would be predicted if sons randomly picked their majors. Although these observations are very informative, they would not provide answers that explain the underlying reasons. In Section 3, we will attempt to confront this challenge.
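As a worked reading of these values, the definition of H can be inverted to recover the ratio of the observed to the expected share of same-field matches, which equals 1 + H/100. The figures below are simple arithmetic on the indices reported above, rounded to two decimals, and are not additional results:

$$1+\frac{H}{100}:\quad \text{Father–Son}\ \ 2.20,\qquad \text{Father–Daughter}\ \ 1.60,\qquad \text{Mother–Daughter}\ \ 1.49,\qquad \text{Mother–Son}\ \ 1.31.$$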

Field-of-study occupation relatedness: FOR

The evidence shows that when people do not work in their trained jobs, the value of their field of study diminishes (Aydede and Dar, 2016; Robst, 2007). A recent study by Lemieux (2014) finds that this wage penalty varies by field of study, from 16% for engineers to 5.7% for degree holders in the Humanities. The quality of parents’ occupational match would also contribute to the formation of subjective expectations about the major-specific outcomes. An accountant working as a chef, for instance, would be a less-reliable channel of information on the prospects of an accounting major than one who works as a certified public accountant.

To measure FOR beyond a binary proposition, related or not, we use the following continuous index suggested by Aydede and Dar (2016):

$$\text{FOR}_{of}=\frac{L_{of}/L_{f}}{L_{o}/L_{T}}$$

where L is the number of workers, o is the occupation, f is the field of study, and T denotes the whole workforce. Given the large sample at our disposal, we use the frequency distribution of 41 fields of study across 40 occupations classified according to the National Occupational Classification (NOC–2011), which gives us 1,640 cells to calculate FOR. For each of the 41 fields of study, when we normalize FOR between 0 and 1 by using the highest FOR as numeraire, the resulting index, NFOR, reveals the ranking of each occupation for each major based on the native-born workers’ distribution. To provide a descriptive summary for FOR, we classify the NFOR in 2 class intervals (1–0.8 and 0.8–0.0) and report the distribution of spouses across these classes and 11 major fields of study in Table 3. If, for any given field of study, we consider the occupations with NFOR between 1.0 and 0.8 as relatively better-matching occupations, we see that 32% of husbands work in related occupations, with the same ratio slightly lower for wives. As expected, the ratio varies across majors from 10% for wives in humanities to 57% for husbands in education.
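The FOR index and its normalization can be sketched as follows. This is an illustrative sketch under stated assumptions: the worker counts are made up (2 fields by 3 occupations instead of 41 by 40), the function names are hypothetical, and the normalization is taken to mean dividing each field's FOR values by that field's highest FOR.

```python
def for_index(workers):
    """FOR_of = (L_of / L_f) / (L_o / L_T); workers[f][o] is a head count."""
    total = sum(sum(row) for row in workers)             # L_T
    field_totals = [sum(row) for row in workers]         # L_f
    occ_totals = [sum(col) for col in zip(*workers)]     # L_o
    return [
        [(workers[f][o] / field_totals[f]) / (occ_totals[o] / total)
         for o in range(len(occ_totals))]
        for f in range(len(field_totals))
    ]


def nfor(for_matrix):
    """Scale each field's row by its largest FOR, so the best match is 1."""
    return [[value / max(row) for value in row] for row in for_matrix]


# Hypothetical counts: rows are fields of study, columns are occupations.
toy_workers = [
    [80, 15, 5],
    [10, 30, 60],
]
toy_for = for_index(toy_workers)
toy_nfor = nfor(toy_for)

# Occupations counted as "related" under the 1.0-0.8 class used in Table 3.
related = [[value >= 0.8 for value in row] for row in toy_nfor]
```

A FOR value above 1 means that holders of a field are over-represented in an occupation relative to that occupation's share of the whole workforce. The sketch assumes every field and every occupation has at least one worker, so no division by zero occurs.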

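Table 3. Distribution of fathers and mothers by NFOR and 12 prime CIP codes (% and weighted)

| Major | Code | Father: NFOR 1.0–0.8 | Father: NFOR 0.8–0.0 | Father: major’s share | Mother: NFOR 1.0–0.8 | Mother: NFOR 0.8–0.0 | Mother: major’s share |
|---|---|---|---|---|---|---|---|
| Education | 1 | 57.32 | 42.68 | 7.41 | 56.80 | 43.20 | 8.74 |
| Arts | 2 | 28.50 | 71.50 | 3.44 | 28.33 | 71.67 | 3.55 |
| Humanities | 3 | 11.03 | 88.97 | 4.86 | 10.05 | 89.95 | 5.16 |
| Law | 4 | 23.89 | 76.11 | 9.62 | 23.51 | 76.49 | 11.15 |
| Business | 5 | 16.34 | 83.66 | 20.02 | 14.98 | 85.02 | 22.94 |
| Science | 6 | 26.53 | 73.47 | 3.33 | 26.28 | 73.72 | 3.21 |
| Math/Comp. | 7 | 32.11 | 67.89 | 3.61 | 29.01 | 70.99 | 3.27 |
| Engineering | 8 | 42.99 | 57.01 | 26.26 | 42.10 | 57.90 | 16.79 |
| Agriculture | 9 | 25.39 | 74.61 | 2.93 | 23.51 | 76.49 | 2.55 |
| Health | 10 | 36.15 | 63.85 | 11.91 | 33.22 | 66.78 | 16.08 |
| Services | 11 | 36.91 | 63.09 | 6.61 | 34.58 | 65.42 | 6.58 |
| Total | | 32.15 | 67.85 | | 29.62 | 70.38 | |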

Notes: (1) See the notes to Table 1 for the full description of majors. (2) The sample used in this table contains all working spouses, regardless of whether they have children with or without an identified CIP code.

Finally, to see the relationship between parents’ education–job relatedness and children’s attraction to their parents’ field of study, we summarize the FSA for each child–parent pair by the parents’ occupational relatedness in Table 4. In the first row, both father (F) and mother (M) work in occupations that are related to their majors. This relatedness is reflected with a binary variable, NFORC, which is 1 if NFOR is between 1 and 0.2; and 0, otherwise. Although this classification is arbitrary, it seems that, in all parent–child pairs, a higher FSA is associated with a greater NFOR. More interestingly, the highest average FSA in each column is observed when the matching parent works in a related job irrespective of the other parent’s occupational relatedness.

For example, in the first column of Table 4, the average FSA is much higher (0.466 and 0.469) when the father’s NFORC is 1, and it is not affected by the mother’s field-of-study relatedness. This observation, recurring in each column, implies that the FSA calculated for each child–parent pair is strongly related to the matching parent’s occupational relatedness but not to that of the other parent. If this positive relationship is statistically meaningful, which we investigate in the following sections, it also implies that FSA indices properly retrieve parental differences in assortativity.

Table 4. Average FSA by NFOR based on 41 major CIP codes (weighted)

| NFORC | Father–son | Mother–son | Father–daughter | Mother–daughter |
|---|---|---|---|---|
| F = 1, M = 1 | 0.466 | 0.534 | 0.525 | 0.445 |
| F = 1, M = 2 | 0.469 | 0.515 | 0.527 | 0.425 |
| F = 2, M = 1 | 0.418 | 0.535 | 0.502 | 0.443 |
| F = 2, M = 2 | 0.417 | 0.517 | 0.492 | 0.428 |

Note: The sample used in this table contains all working spouses regardless of whether they have children with or without an identified CIP code.

Conceptual background

Although theoretical work incorporates the uncertainty in schooling decisions, earlier empirical studies assume that individuals are rational and use the achieved (observed) outcomes to infer decision rules. The recent literature shows that this is not a valid assumption and the difference between beliefs on choice-specific outcomes and their true population values is not trivial. A few recent studies (Wiswall and Zafar, 2015; Zafar, 2012, 2013; Arcidiacono et al., 2012; Stinebrickner and Stinebrickner, 2014) address this identification problem by directly eliciting subjective beliefs from a sample of university students. While the evidence in these studies reveals that subjective expectations on major-specific outcomes greatly vary across individuals, there is a lack of evidence as to why beliefs are so dispersed around the true population values.

In this study, we want to understand the role of parents’ educational background in the process of expectation formation by looking at the assortative preferences that result from asymmetric information. The main driver of child i’s attraction to major m revealed in his/her choice is the expected lifetime utility from the vector of future outcomes (Z) of a specific human capital endowment with the subjective joint probability distribution, G(Z|m,t), at time t, defined as follows:

$$E_{i}V_{i,m}=\sum_{t=1}^{T}\beta^{t-1}\int U(Z)\,dG_{i}(Z \mid m,t)$$

This implies that the appeal of major m would be different from that of major k for individual i due to differences in beliefs reflected in the subjective joint probability distributions G(Z|m,t) and G(Z|k,t), even if the majors have identical distributions in terms of their observed outcomes, F(Z|m,t) = F(Z|k,t). What makes the uncertainty on the same major different for each individual? Or what makes the uncertainty different for each major for the same individual?
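To illustrate how beliefs alone can drive a wedge between two majors, the following sketch evaluates a discrete analogue of the expected-utility expression above. Everything numerical in it is hypothetical: the three outcome levels, the log utility, the discount factor, and the two belief vectors are assumptions chosen only for illustration, and the integral is replaced by a sum over a finite set of outcomes.

```python
import math


def expected_value(beliefs_by_period, outcomes, utility, beta):
    """Discounted expected utility under discrete subjective beliefs.

    beliefs_by_period[t] holds the subjective probabilities over `outcomes`
    in period t + 1, so beta ** t plays the role of beta^(t-1) in the sum.
    """
    return sum(
        beta ** t * sum(p * utility(z) for p, z in zip(probs, outcomes))
        for t, probs in enumerate(beliefs_by_period)
    )


# Hypothetical outcome levels (e.g., low, medium, and high earnings).
outcomes = [30.0, 60.0, 90.0]

# The same child holds different beliefs about two majors over three periods,
# even though both majors could share one objective outcome distribution.
beliefs_m = [[0.2, 0.5, 0.3]] * 3   # well-informed about major m
beliefs_k = [[0.5, 0.4, 0.1]] * 3   # pessimistic, noisier beliefs about major k

value_m = expected_value(beliefs_m, outcomes, math.log, beta=0.95)
value_k = expected_value(beliefs_k, outcomes, math.log, beta=0.95)
```

Under these assumed beliefs, major m has the higher expected value, so the child would be drawn to it purely because of the difference in subjective distributions.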

The concept of information entropy in computer science, introduced by Shannon (1948) and used in economics by Sims (2003), argues that people have limited information-processing capacity, which alters the information for each individual and thus differentiates their behaviors. The “fundamental problem of communication” is for the “receiver” (user of the information) to be able to identify what data were generated by the “source”, based on the signal it receives through the (potentially noisy) “channel”. Sims’s “noisy information model” provides a convenient framework in our context because the information flow is modeled by the discrepancy in probability distributions of the same event at the source and the receiver.

In our context, parents serve as communication channels, not as the source of data, in transmitting publicly available information on choice-specific outcomes to their child, the “receiver”. The parents’ capacity (the level of complexity in their communication and the amount of time for them to convey the data) will be determined and bounded by their own majors. To understand the differences in this capacity and related entropy, one can imagine a biochemist father obtaining, carrying, and sustaining the information on possible outcomes of choosing nuclear physics or accounting as opposed to a father who is a nuclear physicist or an accountant.

The channel capacity, which reflects the information entropy on major m defined by the Kullback–Leibler (DKL) divergence (the difference between the subjective and objective probability distributions of the future outcomes, Z), can be expressed as follows when the father’s field of study (FOS) is set to m:

$$E_i\left(D_{\text{KL}} \mid m, \text{FOS}^{\text{F}} = m, t\right) = \gamma + \alpha\,\text{FSH}^{\text{M}} + \delta\,\text{FOR}^{\text{F}} + \beta\,\text{FOR}^{\text{M}}\,\text{FSH}^{\text{M}} + \rho z, \tag{7}$$

where superscripts F and M denote father and mother, respectively. Equation (7) implies that when FOSF=m, the expected level of information received by the child on major m is equal to an index number, γ, the father’s level of information-processing capacity on his own major plus how compatible the mother’s major is with the father’s major (FSHM), and the degree of relatedness between the parents’ fields of study and their occupations (FOR). The key element in this expression is FSHM, which reflects the degree of relatedness (normalized between 0 and 1) between the fields of study of the parents. Suppose that the mother’s major is the least-related major to her husband’s major (FSHM= 0). It implies that she is not a “high capacity” channel for the information on major m but becomes one on her own major. Hence, a higher degree of field-of-study resemblance between parents makes them more efficient channels (less noisy) for more reliable information on major m by decreasing information entropy. However, a greater homogamy also means that parents become less efficient channels for other majors, with rising relative entropy. Therefore, the level of FSH defines the level of information asymmetry in a family.
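To make the divergence measure concrete, the following minimal sketch computes the Kullback–Leibler divergence between a subjective and an objective distribution over a discretized outcome. The three outcome bins, the probability values, and the choice to measure the divergence of the subjective distribution from the objective one are all assumptions made for illustration; they are not taken from the paper.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_z p(z) * log(p(z) / q(z)) for discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same support")
    total = 0.0
    for p_z, q_z in zip(p, q):
        if p_z > 0.0:
            if q_z <= 0.0:
                return math.inf  # P puts mass where Q has none
            total += p_z * math.log(p_z / q_z)
    return total

# Hypothetical objective distribution F(Z | m) over low / medium / high earnings
objective = [0.2, 0.5, 0.3]
# Hypothetical beliefs G(Z | m) formed through a well-informed parent ...
belief_informed = [0.22, 0.50, 0.28]
# ... and through a parent trained in an unrelated field
belief_uninformed = [0.60, 0.30, 0.10]

kl_informed = kl_divergence(belief_informed, objective)
kl_uninformed = kl_divergence(belief_uninformed, objective)
print(f"informed channel: {kl_informed:.4f}, uninformed channel: {kl_uninformed:.4f}")
```

A less noisy channel corresponds to a smaller divergence, which is the sense in which a parent trained in major m is a higher-capacity channel for that major.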

It could be argued that parents are not the only channels in accessing the information on majors. We assume that the information a child obtains from all other channels (the child’s peers, counselors at his/her school, his/her close relatives, and the parents of his/her best friends) is filtered through the parents. This assumption is in line with the evidence that parental approval is the most important factor in the choice of major (Zafar, 2012). However, this assumption is not required in our empirical setting, as will be evident later.

This example becomes less intuitive when we compare 2 cases in which the father is an accountant but the mother is a biochemist in the first case and a historian in the second. How different would the parents be in terms of channeling reliable information on accounting? Although the values of FSHM would be different, it appears that these 2 cases should be the same in terms of available information on accounting, especially relative to the case where both parents are accountants. However, the information entropy on a major in a family is determined not only by the fact that it is the major of one of the parents, but also by how much the major (accounting) is appreciated, shared, understood, and discussed within the family, which is collectively reflected in FSHM.

It is true that more and better information on a major would not necessarily make it more attractive.

The details of the conceptual framework summarized here can be found in the Appendix.

Empirical framework

The key challenge in understanding the potential contribution of information asymmetry to the observed assortative patterns is to control for other characteristics that are not observed by the researcher but aggregated in FSA. To address this issue, we use a conventional intergenerational transmission framework, wherein we define quasi-likelihood functions with the response variables that take on fractional values of FSA between each child (son/daughter) and parent (father/mother) as a function of the spousal “appreciation” of each partner’s major and field-of-study relatedness. Intergenerational transmission refers to a process that outlines the transfer of individual characteristics, including abilities, preferences, and outcomes, from parents to their children, which we choose as our empirical framework. For example, an intergenerational model of schooling estimated in the literature (Becker and Tomes, 1979; Solon, 2013; Black and Devereux, 2010; Becker et al., 2015) can be expressed as follows:

$$S^{\text{c}} = \alpha_0 + \alpha_1 S^{\text{P}} + \alpha_2 h^{\text{P}} + \alpha_3 f^{\text{P}} + e^{\text{c}} \tag{8}$$

This reduced-form equation explains the child’s schooling (Sc) as a function of the parent’s schooling (Sp), heritable attributes that parents may genetically pass on to children (hp), parenting skills and preferences (fp), and child-specific characteristics (ec) independent from Sp, hp, and fp. Coefficient α1 reflects the causal effect of the parent’s schooling on the child’s schooling joined with, among others, the income effect that more education would be associated with better parental education. It can be shown that, if Equation (8) reflects the true model, a direct estimation of Equation (8) with unobserved hp and fp cannot identify α1, unless one assumes that endowments, hp and fp, are unrelated to Sp.

Holmlund et al. (2011) investigate the findings of a large number of studies to answer the following question: do more educated parents have more educated children because of their education? They show that the evidence is inconsistent across the other strategies (twins, adoptions, and Instrumental Variables models) and they could also encounter problems in obtaining bias-free estimates of causal intergenerational coefficients.

Hence, an estimation of Equation (8) without controlling for ability sorting and better parenting reveals the intergenerational elasticity between parent–child years of schooling, a summary measure of correlational associations between children’s outcome and parental educational background. Although it cannot answer whether more educated parents have more educated children because of their education, the intergenerational elasticity of schooling is a fundamental metric that has been used to measure the mobility across generations.

There are several studies examining intergenerational education and income mobility (elasticity) in Canada: Turcotte (2011), Aydemir et al. (2013), McIntosh (2010), Corak (2001, 2017).

Inspired by this literature, we propose a different identification strategy and start with 4 reduced-form matching functions that use the child’s assortative preferences aggregated in FSA as an outcome of transmission, a process that is built on available information based on the parents’ educational background.

$$\text{FSA}_{\text{F,S}} = \alpha_0 + \alpha_1 \text{NFSH}^{\text{M}} + \alpha_2 \text{NFOR}^{\text{F}} + \alpha_3 h^{\text{M}} + \alpha_4 f^{\text{M}} + \alpha_5 h^{\text{F}} + \alpha_6 f^{\text{F}} + e^{\text{S}}, \tag{9}$$
$$\text{FSA}_{\text{M,S}} = \beta_0 + \beta_1 \text{NFSH}^{\text{F}} + \beta_2 \text{NFOR}^{\text{M}} + \beta_3 h^{\text{M}} + \beta_4 f^{\text{M}} + \beta_5 h^{\text{F}} + \beta_6 f^{\text{F}} + \mu^{\text{S}}, \tag{10}$$
$$\text{FSA}_{\text{F,D}} = \delta_0 + \delta_1 \text{NFSH}^{\text{M}} + \delta_2 \text{NFOR}^{\text{F}} + \delta_3 h^{\text{M}} + \delta_4 f^{\text{M}} + \delta_5 h^{\text{F}} + \delta_6 f^{\text{F}} + \varepsilon^{\text{D}}, \tag{11}$$
$$\text{FSA}_{\text{M,D}} = \theta_0 + \theta_1 \text{NFSH}^{\text{F}} + \theta_2 \text{NFOR}^{\text{M}} + \theta_3 h^{\text{M}} + \theta_4 f^{\text{M}} + \theta_5 h^{\text{F}} + \theta_6 f^{\text{F}} + \mu^{\text{D}}, \tag{12}$$

where M, F, S, and D denote mother, father, son, and daughter, respectively. With the normalized FSH (NFSH) and FOR (NFOR), these equations reflect the idea that a child’s assortative tendencies observed in his/her choice of major are related to the FSH and the degree of relatedness between each parent’s field of study and occupation within a family.

Given the parent’s major, the FSA reflects the child’s decision on a major that maximizes his/her expected utility. The theoretical foundation of this decision-making process is well-defined in the literature (Altonji et al. 2015). For now, we omit other child, parent, and family-specific attributes in Equations (9)–(12).

As long as a higher homogamy (and occupational match) suggests a greater limitation in available information on alternative majors, the coefficients of NFSH (NFOR) capture the underlying field-of-study transmission that relates the children’s assortative preferences to the level of information asymmetry. The variable NFSHM in Equation (9), for instance, is bounded between 0 and 1. It reflects a perfect homogamy as it approaches 1. Intuitively, the α1 coefficient reveals how much the son’s preference for his father’s major will be affected by the extent to which his mother’s field of study becomes comparable. This reminds us of the earlier example: how much the son’s aspiration for his father’s major, accounting, will be affected if his mother was a biochemist instead of an accountant. Similarly, a positive and significant coefficient of NFOR validates the transmission as the parents would be more reliable transmitters of information when they work in their trained jobs. Hence, the presence of intergenerational transmission requires that the coefficients of NFSH and NFOR in those 4 equations should be positive, with dissimilarities reflecting the difference between maternal and paternal influences.

Yet, the identification of transmission due to information asymmetry across alternative majors requires controlling for ability sorting and unobserved heterogeneity. Defining each child’s FSA separately for each parent provides an opportunity to create a setting similar to panel models. Since we observe 2 matches for each child, when we take the difference between them, the dependent variables in these matching functions better reflect the assortative tendency because the omitted heterogeneity across children is differenced out from the equations as shown below.

$$\text{FSA}_{\text{M,S}} - \text{FSA}_{\text{F,S}} = \omega_0 + \omega_1 \text{NFSH}^{\text{F}} - \omega_2 \text{NFSH}^{\text{M}} + \omega_3 \text{NFOR}^{\text{M}} - \omega_4 \text{NFOR}^{\text{F}} + \tau^{\text{c}}, \tag{13}$$
$$\text{FSA}_{\text{M,D}} - \text{FSA}_{\text{F,D}} = \sigma_0 + \sigma_1 \text{NFSH}^{\text{F}} - \sigma_2 \text{NFSH}^{\text{M}} + \sigma_3 \text{NFOR}^{\text{M}} - \sigma_4 \text{NFOR}^{\text{F}} + v^{\text{c}}, \tag{14}$$

A similar identification method is also recognized and applied by Diamond and Agarwal (2016) by using the repeated measurements made available when each agent on one side of the market is matched to at least 2 agents on the other side. The intuition is that the same value of the unobservable characteristic of an agent determines multiple matches of that agent and can be differenced out in a measurement error model (Hu and Schennach, 2008).
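The differencing argument can be illustrated with a small simulation. This is a stylized sketch, not the authors’ estimation: the data-generating process, the coefficient values (0.05 and 0.10), and the helper `ols` are assumptions made for illustration. An unobserved child trait `u` enters both of the child’s matches and is correlated with both homogamy measures, so a levels regression is biased, whereas the within-child difference removes `u` and recovers the assumed coefficients.

```python
import random

def ols(rows, y):
    """Least-squares coefficients of y on the columns of rows (intercept first)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

rng = random.Random(42)
a1, b1 = 0.05, 0.10  # hypothetical effects of NFSH^M on FSA_F and of NFSH^F on FSA_M
nfsh_f, nfsh_m, fsa_f, fsa_m = [], [], [], []
for _ in range(5000):
    u = rng.gauss(0.0, 1.0)  # unobserved child trait shared by both matches
    xm = min(1.0, max(0.0, 0.5 + 0.15 * u + rng.gauss(0.0, 0.15)))
    xf = min(1.0, max(0.0, 0.5 + 0.15 * u + rng.gauss(0.0, 0.15)))
    nfsh_m.append(xm)
    nfsh_f.append(xf)
    fsa_f.append(0.35 + a1 * xm + 0.05 * u + rng.gauss(0.0, 0.05))
    fsa_m.append(0.45 + b1 * xf + 0.05 * u + rng.gauss(0.0, 0.05))

# Levels regression: contaminated by the omitted child trait
levels = ols([[x] for x in nfsh_m], fsa_f)
# Within-child difference: u cancels, leaving b1 on NFSH^F and -a1 on NFSH^M
diff = ols(list(zip(nfsh_f, nfsh_m)), [m - f for m, f in zip(fsa_m, fsa_f)])
print(f"levels slope: {levels[1]:.3f}")
print(f"differenced:  NFSH^F {diff[1]:.3f}, NFSH^M {diff[2]:.3f}")
```

In this setup the levels slope is several times larger than the assumed 0.05, while the differenced regression returns values close to 0.10 and -0.05.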

Models based on many-to-one matches are not new and are well-discussed in the literature (Roth and Sotomayor, 1992). The consequence of possible measurement errors in the dependent variable in our case may not result in attenuation bias but may inflate the standard errors of the estimates.

Unlike in other matching markets, this is particularly effective in our case because the assortativity revealed by FSA reflects only the child’s preferences defined over children, not over parents in matches.

These equations with within-parents differencing suggest that the difference between FSAM,S and FSAF,S, for instance, should be smaller when NFSHM decreases, holding other covariates constant. Intuitively, if the mother married to an accountant holds a degree in biochemistry, NFSHM approaches its lower limit.

This statement is justified based on a strong field-of-study homogamy reported in Section 2.2.

As the mother becomes another channel of information on an alternative major, viz., biochemistry, the family information boundaries expand. Unlike the case when the mother was an accountant, this increase in the level of available information in turn reduces the son’s bias toward his father’s major, i.e., accounting. Therefore, FSAF,S (the son’s attraction to his father’s major) should be smaller when NFSHM (resemblance of the mother’s major to her husband’s, measured by spousal differences in the appeal of their majors) becomes lower. Hence, the differences in ω1 and ω2, as well as σ1 and σ2, will provide information about the difference in transmission between fathers and mothers. However, the value (and the volume) of the available information provided by the homogamy measures in the family depends on whether the parents work in related occupations. This could be better understood if we change the accountant–biochemist example to one where the father works as a chartered accountant while the biochemist mother works as a branch manager in a bank, which diminishes the value of information on biochemistry from the mother. Since the parents would be a better channel of information conditional on the quality of their occupational match, an increasing NFORM in Equation (13) should have both a negative impact on FSAF,S and a positive effect on FSAM,S. Hence, a positive and significant coefficient of NFORM indicates the existence of a transmission of field of study reinforced by expanding the reliable information within the family.

As outlined earlier, in addition to the level of information asymmetry built on the parents’ fields of study, children’s assortative preferences could also reflect ability sorting. The suggested within-family specifications can address this identification problem conditional on the assumption that the effects of unobserved parental traits in Equations (9) and (10), as well as in Equations (11) and (12), are statistically similar. Without this assumption and excluding NFOR for now, Equation (13) can be expressed as follows:

$$\Delta \text{FSA}_{\text{S}} = \omega_0 + \omega_1 \text{NFSH}^{\text{F}} - \omega_2 \text{NFSH}^{\text{M}} + \omega_3 h^{\text{M}} + \omega_4 f^{\text{M}} + \omega_5 h^{\text{F}} + \omega_6 f^{\text{F}} + \tau^{\text{c}}, \tag{15}$$

where ω3 = (β3 − α3), ω4 = (β4 − α4), ω5 = (β5 − α5), and ω6 = (β6 − α6). When estimated by ordinary least squares (OLS), identification of ω2 (ω1) requires either that NFSHM (NFSHF) is independent of unobserved parental traits or that ω3, ω4, ω5, and ω6 are 0, as shown below.

$$\begin{aligned} \operatorname{plim}\, \widehat{\omega}_{2,\text{ols}} = {} & \omega_2 + \omega_3 \frac{\operatorname{cov}\left(\text{NFSH}^{\text{M}}, h^{\text{M}}\right)}{\operatorname{var}\left(\text{NFSH}^{\text{M}}\right)} + \omega_4 \frac{\operatorname{cov}\left(\text{NFSH}^{\text{M}}, f^{\text{M}}\right)}{\operatorname{var}\left(\text{NFSH}^{\text{M}}\right)} \\ & + \omega_5 \frac{\operatorname{cov}\left(\text{NFSH}^{\text{M}}, h^{\text{F}}\right)}{\operatorname{var}\left(\text{NFSH}^{\text{M}}\right)} + \omega_6 \frac{\operatorname{cov}\left(\text{NFSH}^{\text{M}}, f^{\text{F}}\right)}{\operatorname{var}\left(\text{NFSH}^{\text{M}}\right)}. \end{aligned} \tag{16}$$
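The logic of this bias expression can be checked on simulated data. The sketch below is a simplified, hypothetical case with a single omitted heritable trait `h` and NFSHM as the only included regressor; the parameter values are invented for illustration. Because NFSHM enters the differenced equation with a negative sign, its regression coefficient is written as `-omega2`. The short-regression slope is compared with the value implied by the omitted-variable formula, that is, the true coefficient plus `omega3` times the ratio of the covariance to the variance.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (dividing by n); cov(a, a) is the sample variance."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

rng = random.Random(7)
omega2, omega3 = 0.10, 0.05  # hypothetical transmission and ability effects
nfsh_m, h, delta_fsa = [], [], []
for _ in range(20000):
    h_i = rng.gauss(0.0, 1.0)                      # unobserved heritable trait
    x_i = 0.5 + 0.1 * h_i + rng.gauss(0.0, 0.1)    # homogamy correlated with the trait
    y_i = 0.02 - omega2 * x_i + omega3 * h_i + rng.gauss(0.0, 0.05)
    h.append(h_i)
    nfsh_m.append(x_i)
    delta_fsa.append(y_i)

# Slope from regressing the outcome on NFSH^M alone (h omitted)
short_slope = cov(nfsh_m, delta_fsa) / cov(nfsh_m, nfsh_m)
# Value implied by the omitted-variable formula
predicted = -omega2 + omega3 * cov(nfsh_m, h) / cov(nfsh_m, nfsh_m)
print(f"short regression: {short_slope:.3f}, formula: {predicted:.3f}")
```

With these invented numbers the omitted trait is strong enough to flip the sign of the estimated coefficient, which is why the correlation between homogamy and heritable traits matters for identification.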

First, we think that parents’ child-rearing skills, fM and fF, should not be significantly correlated with the homogamy measures NFSHM and NFSHF. It would be hard to find a systemic reason why individuals who choose their spouses in the same field of study would also be the future parents with more skills in rearing their children. Thus, a possible bias in the estimate of ω2 should mostly originate from heritable traits, i.e., hM and hF, and their correlation with homogamy measures. To the extent that a field of study reveals the person’s overall ability endowments, it would be reasonable to question the role of ability sorting in field-of-study matches. But, it is ambiguous how this possibility translates into nonzero cov(NFSHM,hF) and cov(NFSHM,hM).

If we assume that h represents heritable mathematical skills, for instance, a higher NFSHM could be related to a higher and a lower hF (or hM) at the same time.

While we could observe a high NFSH for engineers and historians, they would have different mathematical skill endowments.

To test this ambiguity, we can use matches where both spouses have at least a university degree in one of the following majors: science, technology, engineering, and math (STEM). Hence, what we observe by a higher or lower NFSH among STEM majors should be the differences in assortative preferences isolated from ability sorting. In other words, if NFSH is relatively higher for electrical/computer engineers, it means that they mostly choose their partners in similar fields instead of in theoretical statistics or chemical engineering, which are otherwise comparable in terms of ability requirements. The size of the data enables us to reduce the effect of cov(NFSHM,hF) and cov(NFSHM,hM) on the bias by estimating specifications (13) and (14) only for families that have similar ability endowments. Hence, as shown below, introducing a binary variable, STEM (equal to 1 if both parents hold at least a university degree in one of the STEM majors, and 0 otherwise), into Equation (15) would help us address a possible bias in the transmission coefficients.

$$\begin{aligned} \Delta \text{FSA}_{\text{S}} = {} & \omega_0 + \omega_1 \text{NFSH}^{\text{F}} - \omega_2 \text{NFSH}^{\text{M}} + \omega_3 \text{STEM} + \omega_4 \left(\text{STEM} \times \text{NFSH}^{\text{F}}\right) - \omega_5 \left(\text{STEM} \times \text{NFSH}^{\text{M}}\right) \\ & + \omega_6 \text{NFOR}^{\text{M}} - \omega_7 \text{NFOR}^{\text{F}} + \omega_8 h^{\text{M}} + \omega_9 f^{\text{M}} + \omega_{10} h^{\text{F}} + \omega_{11} f^{\text{F}} + \tau^{\text{c}}. \end{aligned} \tag{17}$$

The coefficients of interaction terms will reveal the differences in the sons’ assortative preferences in STEM families. With the within-family specification, 2 factors will shrink the bias on these coefficients: first, the differential effects of unobservables, ω8 = (β3 − α3) and ω10 = (β5 − α5) in Equation (17), as opposed to their levels in specifications (9)–(12), will diminish their size; and second, cov(STEM × NFSHM,hF) and cov(STEM × NFSHM,hM) will be close to 0 for a subsample as specified by Equation (17). The definition of the bias in the estimate of ω5, for instance, can be expressed as follows:

$$\begin{aligned} \operatorname{plim}\, \widehat{\omega}_{5,\text{ols}} = {} & \omega_5 + \left(\beta_3 - \alpha_3\right) \frac{\operatorname{cov}\left(\text{STEM} \times \text{NFSH}^{\text{M}}, h^{\text{M}}\right)}{\operatorname{var}\left(\text{STEM} \times \text{NFSH}^{\text{M}}\right)} \\ & + \left(\beta_5 - \alpha_5\right) \frac{\operatorname{cov}\left(\text{STEM} \times \text{NFSH}^{\text{M}}, h^{\text{F}}\right)}{\operatorname{var}\left(\text{STEM} \times \text{NFSH}^{\text{M}}\right)}. \end{aligned} \tag{18}$$

Hence, the size and the significance of the interaction coefficients ω4 and ω5 will reveal whether cov(NFSHM,hF) and cov(NFSHM,hM) can reasonably be assumed to be 0. The next section will provide the results.

Estimation results
Without within-family differencing

We start with the 4 equations from Equation (9) to Equation (12). To reduce the unobserved heterogeneity across families, we expand the equations by controlling for household income, provincial fixed effects, first spoken official language, household size, and whether the family resides in an urban or rural area. We also control for homogamy in terms of parents’ highest educational degree.

Education-degree homogamy (EDH) is calculated similarly to FSH by using Equation (1). A total of 11 major educational degrees are identified in the 2011 NHS: trades, registered apprenticeship, college - <1 year, college - 1–2 years, college - >2 years, university - below bachelor’s, bachelor’s, above bachelor’s - less than master’s, medicine-dentistry-veterinary, master’s, and PhD.

After these additions, Table 5 reports 2 sets of estimation results for the selected variables.

Since our specifications have fractional response variables that have values ranging between zero and one, their linearity in this range becomes a question. To address this issue, we have also estimated all specifications in this section with quasi-likelihood methods where the response variables are transformed to log odds using the binomial distribution (Papke and Wooldridge, 1996). Since the results are almost the same, we report here only the linear specifications estimated by OLS.
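The comparison between the linear and the quasi-likelihood specifications can be sketched on simulated data. The code below is an illustrative assumption, not the authors’ estimation code: it fits the Bernoulli quasi-likelihood with a logit link (in the spirit of Papke and Wooldridge, 1996) by Newton’s method for a single regressor and compares the implied average marginal effect with the OLS slope. The data-generating process, the parameter values, and the function name `fractional_logit` are hypothetical.

```python
import math
import random

def fractional_logit(x, y, n_iter=25):
    """Bernoulli quasi-MLE for E[y | x] = logistic(b0 + b1 * x), with y in [0, 1]."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu            # score with respect to b0
            g1 += (yi - mu) * xi     # score with respect to b1
            h00 += w                 # entries of the negative Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add (negative Hessian)^(-1) times the score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

rng = random.Random(1)
x, y = [], []
for _ in range(5000):
    xi = rng.random()                                  # regressor on [0, 1), like NFSH
    mu = 1.0 / (1.0 + math.exp(-(-0.6 + 0.4 * xi)))    # hypothetical conditional mean
    x.append(xi)
    y.append(rng.betavariate(20.0 * mu, 20.0 * (1.0 - mu)))  # fractional outcome

b0, b1 = fractional_logit(x, y)
fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
ame = b1 * sum(m * (1.0 - m) for m in fitted) / len(fitted)  # average marginal effect

mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
ols_slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
print(f"fractional logit AME: {ame:.4f}, OLS slope: {ols_slope:.4f}")
```

When the fitted means stay in the middle of the unit interval, as in this setup, the two numbers are nearly identical, which is the situation in which reporting only the linear specification loses little.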

The first 4 columns report the estimation results, which include NFSH for each parent without accounting for parents’ occupational relatedness. We control for FSH in the last 4 columns as a binary variable (1 if both parents have the same field of study, and 0 otherwise) and add FOR, for both father and mother, as a categorical variable, FORC, which is equal to 1 if the normalized FOR is <0.2 and 0 otherwise. The first 4 specifications use larger subsamples because they exclude FOR, which can be identified only if the person’s occupation is known.

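Table 5. Intergenerational transmission of field of study with 41 major CIP codes

| | (1) FSA–Son: Father | (1) FSA–Son: Mother | (1) FSA–Daughter: Father | (1) FSA–Daughter: Mother | (2) FSA–Son: Father | (2) FSA–Son: Mother | (2) FSA–Daughter: Father | (2) FSA–Daughter: Mother |
|---|---|---|---|---|---|---|---|---|
| FSH = 1 (if same major) | | | | | 0.055 | 0.002 | 0.025 | 0.015 |
| | | | | | (0.001) | (0.008) | (0.001) | (0.062) |
| NFSHM | 0.050 | | 0.051 | | | | | |
| | (0.006) | | (0.005) | | | | | |
| NFSHF | | 0.100 | | 0.038 | | | | |
| | | (0.003) | | (0.002) | | | | |
| FORCF = 1 | | | | | 0.044 | | 0.027 | |
| | | | | | (0.002) | | (0.001) | |
| FORCM = 1 | | | | | | 0.020 | | 0.016 |
| | | | | | | (0.001) | | (0.003) |
| EDH | 0.046 | 0.027 | 0.018 | 0.025 | 0.040 | 0.045 | 0.025 | 0.026 |
| | (0.003) | (0.002) | (0.01) | (0.007) | (0.001) | (0.003) | (0.001) | (0.002) |
| Household income | 0.009 | 0.002 | 0.006 | 0.001 | 0.007 | 0.002 | 0.005 | 0.001 |
| | (0.001) | (0.009) | (0.001) | (0.002) | (0.001) | (0.009) | (0.001) | (0.004) |
| Constant | 0.347 | 0.529 | 0.464 | 0.434 | 0.390 | 0.615 | 0.464 | 0.434 |
| | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) |
| Number of Obs. | 23,748 | 23,748 | 22,254 | 22,254 | 21,101 | 20,016 | 20,195 | 19,159 |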

Notes: (1) Dependent variables are indicated in each column’s heading. (2) Standard errors reported under the coefficients are adjusted by using the two-way clustering method (Cameron et al., 2011) at the individual and household levels. (3) EDH reflects education-degree homogamy and is a continuous variable normalized between 0 and 1. HH Income is the annual disposable income for the household. Other variables that are not reported in the table control for household size, first spoken official language, whether the family is in rural area, and provincial fixed effects. (4) We also ran the regressions with and without the parental age variables. The results are insensitive to the inclusion of parental age variables. (5) When we control for field-of-study fixed effects, the results do not change significantly.

The results reported in Table 5 are informative as they reflect the maternal and paternal differences in children’s assortative preferences in choosing majors. The robust and positive NFSH coefficients provide evidence for the existence of what we call intergenerational transmission of field of study. As outlined before, the results reflect the combination of ability sorting, differences in parenting skills, and unobserved heterogeneity in individual and family characteristics, in addition to the limited information accessibility constrained by the parents’ fields of study. The first 2 columns show that the son’s attraction to his parents’ majors is strongly related to the FSH, measured by spousal “appreciation” of each parent’s major. A comparison of the coefficients (0.10 and 0.05) indicates that the paternal influence is more dominant in educational transmission for sons. A similar gap is not observed for daughters reported in the third and fourth columns. The robust NFSH coefficients still suggest that daughters will also be attracted to their parents’ field of study, yet mothers have more influence on daughters.

In the last 4 specifications, we distinguish the parents who have the same field of study and control for their occupational match. The results are consistent with those of the first 4 specifications. The effect of having homogamous parents on the son’s attraction to his father’s major (0.055) is much higher than his attraction to his mother’s field of study (0.002). Again, the same significant but smaller difference can be observed for daughters. The second channel to identify the transmission is the relatedness of parents’ field of study to their occupation, which is controlled by FORC in the last 4 estimations. The results confirm a strong and positive relationship between the parents’ occupational match and the children’s attraction to their parents’ majors. The parental difference in this effect is also noticeable and in line with the earlier findings with FSH: the paternal effect is greater than the maternal influence for sons, while the same difference is less magnified for daughters. When it comes to other factors, a higher homogamy in terms of educational degree (EDH) is positively and significantly associated with FSA. Similarly, a higher household income has a positive effect on FSA. Among the other variables not reported in Table 5, only the urban–rural distinction in households’ location is significant. Children from families in larger cities experience higher FSA.

To test the robustness of the results in Table 5, we also used different levels of the CIP and occupation classifications available in the 2011 NHS. The results are not sensitive to using larger or smaller dimensions of match tables.

Within-family differencing

We address the identification with a within-family differencing as described in the previous section and report the results in Table 6.

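Table 6. Transmission of field of study by within-family specification with 41 major CIP codes

| | (FSAM,S) − (FSAF,S): (1) | (FSAM,S) − (FSAF,S): (2) | (FSAM,D) − (FSAF,D): (1) | (FSAM,D) − (FSAF,D): (2) |
|---|---|---|---|---|
| (NFSHF) – (NFSHM) | 0.153 | | 0.014 | |
| | (0.002) | | (0.012) | |
| DFORC = 1 | Base | | Base | |
| DFORC = 2 | −0.033 | | 0.027 | |
| | (0.001) | | (0.001) | |
| DFORC = 3 | 0.035 | | 0.163 | |
| | (0.001) | | (0.068) | |
| NFSHM | | −0.101 | | 0.005 |
| | | (0.001) | | (0.023) |
| NFSHF | | 0.162 | | 0.021 |
| | | (0.001) | | (0.018) |
| FORCM = 1 | | −0.016 | | 0.016 |
| | | (0.001) | | (0.071) |
| FORCF = 1 | | 0.051 | | 0.029 |
| | | (0.001) | | (0.001) |
| Constant | 0.069 | 0.026 | 0.076 | 0.070 |
| | (0.001) | (0.002) | (0.001) | (0.001) |
| Number of Obs. | 21,018 | 21,018 | 19,942 | 19,942 |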

Notes: (1) Dependent variables are indicated in each column’s heading. (2) Standard errors reported under the coefficients are adjusted by using the two-way clustering method (Cameron et al., 2011) at the individual and household levels.

The first 2 columns show the estimation results of Equation (13) with the same dependent variable, the difference in the son’s attraction to his parents’ majors. The first column reports the estimation results of the restricted version of Equation (13). The estimation results for daughters based on Equation (14) are reported in the last 2 columns. The restricted specifications in the first and the third columns use a new categorical variable, DFORC, which reflects the parental difference in NFOR in 3 categories. The base category refers to the case in which both parents have the same field-of-study relatedness: either both work in related jobs (i.e., NFOR is between 0.2 and 1.0 for both parents) or both work in unrelated jobs (i.e., NFOR is between 0.0 and 0.2 for both parents). The second category indicates that the father works in a matching occupation while the mother does not. The third category specifies the opposite situation. Hence, the effect of parental differences in field-of-study relatedness can be captured by the last 2 categories.
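The three-way classification of parental occupational relatedness can be written as a short function. This is a sketch of the coding rule as described in the text, with hypothetical function names; the treatment of a value exactly at the cutoff (counted here as related) is an assumption, since the text only states that values below 0.2 are unrelated.

```python
def is_related(nfor, threshold=0.2):
    """True when the normalized FOR is at or above the cutoff (job related to training)."""
    return nfor >= threshold

def dforc_category(nfor_father, nfor_mother, threshold=0.2):
    """Classify a family by the parents' field-of-study/occupation relatedness.

    1: base, both parents related or both unrelated
    2: father in a matching occupation, mother not
    3: mother in a matching occupation, father not
    """
    father = is_related(nfor_father, threshold)
    mother = is_related(nfor_mother, threshold)
    if father == mother:
        return 1
    return 2 if father else 3

# Hypothetical families: (father's NFOR, mother's NFOR)
families = [(0.9, 0.8), (0.1, 0.05), (0.9, 0.1), (0.1, 0.9)]
categories = [dforc_category(f, m) for f, m in families]
print(categories)
```

Passing a different `threshold`, such as 0.4, reproduces the alternative cutoffs mentioned for the robustness checks.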

We define the base category with two opposite cases, either both parents work in related jobs or unrelated jobs, because we want to estimate the effect of field-of-study relatedness for each parent. Given that the dependent variable is the difference in child’s attraction to each parent’s major, this effect can only be captured when parents’ FOR is different.


The idea here is to identify the effect of FOR on the children’s attraction to their parents’ major, when the parents work in an unrelated occupation. Therefore, we actually tried to find the lowest cutoff point that realistically classifies the person’s job completely unrelated to his/her training. We also applied higher thresholds up to 0.4. The results are still robust. This is mostly because the distribution of FOR is convex. Hence, increasing the threshold from 0.2 to 0.4 had a minor effect because relatively few people exist in the bin of 0.2–0.4.

The results are interesting and in line with the findings in our earlier estimations: the coefficient of (NFSHF) – (NFSHM) in the first column, 0.153, confirms that (FSAM,S) – (FSAF,S) is greater when the gap between the appeal of each spouse’s major, (NFSHF) – (NFSHM), increases. Although this signifies the presence of intergenerational transmission, it does not offer an insight into the parental difference. This is because the gap could rise when NFSHF goes up, NFSHM goes down, or both occur simultaneously. The second column, based on Equation (15), helps us understand the difference. The signs of the coefficients on NFSHF and NFSHM are as expected. Since an increase in NFSHM has a positive impact on FSAF,S, it reduces the distance between FSAM,S and FSAF,S. When the mother’s major becomes similar to the father’s, NFSHM rises. Because higher homogamy implies more constraint in the family in terms of available information on other majors, the son’s bias toward his father’s field of study rises. This is confirmed by the negative sign of the NFSHM coefficient. Equally, when NFSHF rises, the similarity between parents’ majors becomes higher. Constrained by less information being available regarding other majors, the son’s attraction toward his mother’s field of study increases. This is verified by the positive sign of the NFSHF coefficient: because a rise in NFSHF increases FSAM,S, the distance between FSAM,S and FSAF,S becomes larger. The difference between these effects (0.101 and 0.162) again suggests that paternal influence is noticeably greater than maternal influence for sons. The same comparison for daughters in both specifications of Equation (14) would not offer the same evidence, which is also consistent with the relatively weaker effects for daughters reported in Table 5.

The existence of intergenerational transmission is also verified by the effect of the parents’ field-of-study relatedness. In the first column, when evaluated against the base, the first category (fathers work in their trained job but mothers do not) has a negative effect on (FSAMS) – (FSAFS). Similarly, a significant positive effect is observed for the second category, wherein the mother works in her trained job but the father does not. These results are also confirmed with the unrestricted specification reported in the second column. Now, using FORC, if the mother’s major is not a good fit for her occupation, the negative coefficient (–0.016) indicates that FSAMS falls. Yet, when the father faces an educational mismatch in his job, the effect on (FSAMS) – (FSAFS) captured by a positive coefficient (0.051) becomes much greater. Interestingly, despite the insignificant effects of NFSH in the third and fourth columns, significant effects of FORC are observed for daughters, indicating the importance of parents’ occupational matching in transmission.

Within-family differencing among families with STEM majors

With within-family differencing as specified by Equations (13) and (14), the other factors, such as the effects of siblings, neighborhoods, and peers, either observed or unobserved, are differenced out in the estimations. Hence, the results deliver better evidence about the role of information constraint in children’s assortative preferences. However, as outlined earlier, the success of this identification strategy is conditional on the extent to which the FSH is driven by the ability sorting in parents’ marriage.
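A minimal simulation sketch of why differencing helps is given below. It does not reproduce Equations (13) and (14); the variable names, the "true" slopes, and the data-generating process are all hypothetical. The only point is that a factor shared by both child–parent matches within a family cancels when one match is subtracted from the other.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5_000  # hypothetical number of families

# Unobserved family-level factor (siblings, neighborhood, peers, ...).
family_effect = rng.normal(size=n)

# Hypothetical regressors, deliberately correlated with the family factor.
x_mother = 0.5 * family_effect + rng.normal(size=n)
x_father = 0.5 * family_effect + rng.normal(size=n)

beta_mother, beta_father = 0.16, 0.10  # illustrative slopes only

# The child's attraction to each parent's major shares the same family factor.
y_mother = beta_mother * x_mother + family_effect + rng.normal(scale=0.1, size=n)
y_father = beta_father * x_father + family_effect + rng.normal(scale=0.1, size=n)

def ols(X, y):
    """Ordinary least squares with an intercept; returns [const, slopes...]."""
    X = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Levels regression: biased because family_effect is left in the error term.
naive = ols(x_mother, y_mother)

# Within-family difference: family_effect cancels out of y_mother - y_father.
diff = ols(np.column_stack([x_mother, x_father]), y_mother - y_father)

print("naive slope on x_mother:", round(naive[1], 3))
print("differenced slopes (x_mother, x_father):", np.round(diff[1:], 3))
```

In the differenced regression the slope on `x_father` enters with a negative sign because the father's match is subtracted. The same sign pattern appears in a specification whose dependent variable is a mother-minus-father difference.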

One way to address this problem is to use a subsample that includes only those families in which both parents hold at least a university degree in one of the STEM majors so that the difference in terms of their ability endowments would not be significant. Table 7 reports the estimations of the same specifications shown in the second and the last columns of Table 6 with STEM variables as expressed by Equation (17).

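Table 7. Within-family specification for STEM parents with 41 major CIP codes

| Variable | (FSAMS) − (FSAFS): Coef. | (FSAMS) − (FSAFS): Std. err. | (FSAMD) − (FSAFD): Coef. | (FSAMD) − (FSAFD): Std. err. |
|---|---|---|---|---|
| NFSHM | 0.079 | 0.002 | 0.001 | 0.370 |
| NFSHF | 0.161 | 0.001 | 0.020 | 0.019 |
| STEM | 0.151 | 0.001 | 0.019 | 0.176 |
| STEM × NFSHM | 0.042 | 0.101 | 0.037 | 0.306 |
| STEM × NFSHF | 0.037 | 0.098 | 0.103 | 0.423 |
| FORCM = 1 | 0.017 | 0.001 | 0.016 | 0.004 |
| FORCF = 1 | 0.042 | 0.002 | 0.027 | 0.001 |
| Constant | 0.027 | 0.003 | 0.073 | 0.003 |
| Number of Obs. | 21,018 | | 19,942 | |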

Notes: (1) Dependent variables are indicated in each column’s heading. (2) Standard errors are adjusted by using the two-way clustering method (Cameron et al., 2011) at the individual and household levels. Coef. = coefficient; Std. err. = standard error.
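For readers unfamiliar with the two-way clustering mentioned in the notes, the following is a minimal sketch of the Cameron et al. (2011) variance formula, V = V_a + V_b − V_(a∩b), for an OLS fit. It is not the estimation code behind Table 7: the data are simulated, the function names are hypothetical, no small-sample correction is applied, and cluster identifiers are assumed to be non-negative integers.

```python
import numpy as np

def cluster_meat(X, resid, groups):
    """Sum over clusters g of (X_g' u_g)(X_g' u_g)'."""
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        mask = groups == g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)
    return meat

def one_way_cluster_cov(X, resid, groups):
    """Sandwich covariance clustered on a single grouping."""
    bread = np.linalg.inv(X.T @ X)
    return bread @ cluster_meat(X, resid, groups) @ bread

def two_way_cluster_cov(X, resid, groups_a, groups_b):
    """Two-way clustered covariance: V_a + V_b - V_(a and b)."""
    # One id per (a, b) pair; relies on non-negative integer ids.
    both = groups_a * (groups_b.max() + 1) + groups_b
    bread = np.linalg.inv(X.T @ X)
    meat = (cluster_meat(X, resid, groups_a)
            + cluster_meat(X, resid, groups_b)
            - cluster_meat(X, resid, both))
    return bread @ meat @ bread

# Hypothetical data with two cluster dimensions.
rng = np.random.default_rng(1)
n = 600
individual = rng.integers(0, 300, size=n)
household = rng.integers(0, 150, size=n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.03, 0.16]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
V = two_way_cluster_cov(X, resid, individual, household)
print("two-way clustered covariance matrix:\n", V)
```

Subtracting the intersection term avoids double-counting observation pairs that share both cluster dimensions. When the two groupings coincide, the formula collapses to the ordinary one-way clustered covariance.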

When one of the parents holds a degree in one of the non-STEM majors (or less than a bachelor’s degree), the coefficients of NFSHM and NFSHF (–0.079 and 0.161) are almost identical to those reported in Table 6 for sons. This is plausible given that the share of families where both parents have a STEM major with at least a university degree is below 20% in the whole sample. The insignificant interaction terms indicate that ability sorting may not play a strong role in sons’ assortative preferences. Hence, when the comparison is made only among STEM parents, the difference between paternal and maternal effects observed in field-of-study transmissions tends to remain similar to that found in our earlier results. The significant effect of STEM implies that the difference between FSAMS and FSAFS is lower for STEM families than for non-STEM families. None of the results are significant for daughters, except for FORC, which is in line with our earlier findings. There is a large literature on gender differences in occupational preferences and major choices. However, we do not have a satisfactory explanation for why daughters’ choices show no evidence of a link between parents’ homogamy and assortative preferences in the choice of major.

Limitations

The total elasticity of the assortative preferences in terms of parental homogamy can be expressed for sons by the sum of the absolute values of the NFSHM and NFSHF coefficients from Table 7 (–0.079 and 0.161), which is 0.24. This measure suggests an important role of information asymmetry in children’s choice of major to the extent that the FSH reflects the level of constraint on the available information when children choose a major. It should be noted that the results reported here are subject to several limitations. First, although our sample of children living with their parents is representative of the whole sample, there may still be a selection problem whereby children living with their parents have different behavioral predispositions that affect their assortative preferences.
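As a quick check of the arithmetic behind the 0.24 figure quoted above for sons: it corresponds to the sum of the magnitudes of the two Table 7 coefficients, not to their algebraic sum.

```latex
\[
\underbrace{|-0.079|}_{\text{NFSHM}} + \underbrace{|0.161|}_{\text{NFSHF}}
  = 0.079 + 0.161 = 0.240 \approx 0.24,
\qquad\text{whereas}\qquad
-0.079 + 0.161 = 0.082 .
\]
```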

Second, our underlying model is static and uses data that include children mostly with completed majors. The evidence in the literature is very clear that students update their beliefs in their first years of study and switch majors if the cost is bearable. We believe that using data on completed majors leads to a downward bias in our estimations.

Third, the constraint on available information in a family, as measured by the FSH, would not necessarily imply a positive bias in children’s choice toward their parents’ majors. Although it is less likely, two accountant parents would not necessarily favor their child taking up their major and may even deter the child from it. This possibility would also create a downward bias in our estimations.

Finally, as is very common in most empirical studies in the field of education economics, our attempt to remove a possible ability bias from our estimations has its own limits. We think that specifications that use within-family differencing and a proxy that groups families with similar ability endowments substantially shrink the bias. Still, within-family differencing may have some other complications in our estimations. For example, a son’s attraction to both parents’ majors may not be homogeneously comparable, when each parent’s attraction to their own field of study strongly reflects their gender preferences. There is an extensive literature on gender differences in field-of-study preferences (Zafar, 2013). Our hypothesis in this study implies that, when his mother’s field of study becomes distinct from his father’s major, the son’s attraction to his father’s major will be affected negatively. This is because the field-of-study diversity in the family will expand the information boundaries and consequently his choice set on majors. This may not be true, i.e., he will not be less attracted to his father’s major, if his mother’s choice of field of study is strongly gender based. Our expectation is that a possible bias due to this issue would lead to underestimation of the true effect of information constraints.

With all these caveats, we still believe that the transmission coefficients provide very valuable information on the intergenerational field-of-study elasticity, which is the first in the literature, to the best of our knowledge.

Concluding remarks

The potential spillover effect of education is a fundamental public policy matter because it may lead to progressive skill stratifications and dispersed income distributions in every generation if ability sorting in mating and across generations is substantial. Most studies use years of schooling as the educational outcome for children, treating education as unidimensional. Yet, educational decisions are no longer just about the quantity but about the specialization to be pursued as well. This study quantifies assortative mating by estimating FSH and intergenerational transmission of skills by measuring assortative preferences in the choice of major. As uncertainty increases with the complexity of educational choices, misinformed decisions made by students in choosing their field of study or by administrators in allocating their limited resources across disciplines would curtail social and economic progress. This study’s primary objective is to investigate children’s attraction to their parents’ field of study, reflected by assortative tendencies in child–parent matches as an outcome of information asymmetry.

To identify the role of information asymmetry in assortative patterns in each field-of-study match between parents and children, we define quasi-likelihood transmission functions, wherein the response variables take on fractional values of FSA between each child (son/daughter) and parent (father/mother) as a function of the spousal “appreciation” of each partner’s field of study. We use the confidential major file of the 2011 NHS so that the size of the data and the availability of different levels of aggregation in the CIP allow us to develop three indicators: the degree of children’s attraction to their parents’ field of study (FSA), the degree of FSH, and the degree of relatedness between each parent’s field of study and occupation (FOR).

Comparable to panel models, we define within-family transmission functions with 1-to-2 matches (1 for each parent). The results show that children’s choice of field of study exhibits significant assortative preferences isolated from ability sorting and unobserved differences across majors and other family characteristics. We also find that the assortative tendency is the highest between fathers and sons relative to all other pairs, namely, father–daughter, mother–son, and mother–daughter. This evidence becomes even stronger when we use more disaggregated CIP codes and control for the educational degrees. With some caution, we attribute this persistent assortative tendency to the information asymmetry across alternative majors that parents’ educational backgrounds create within families.