Open Access

How general is managerial human capital?: Evidence from the Retention of Managers after M&As



Introduction

How general is managerial human capital? Market-based views of managerial compensation explain high CEO compensation in the U.S. as the result of competition to employ scarce managerial talent (e.g., Gabaix and Landier, 2008; Terviö, 2008). However, the validity of these theories depends critically on the answer to this question, because such competition presupposes that managerial skills are transferable across firms. This question is also related to the sensitivity of managerial turnover to poor performance by a firm (e.g., Eisfeldt and Kuhnen, 2013). Thus, it is one of the most important empirical questions in the literature on corporate governance.

The empirical studies on the retention of managers after M&As may provide some evidence to assist in answering this question. Although several theories of takeovers presume that top managers in a target firm must be replaced after a takeover, there is increasing evidence that the retention of a management group is important to the new firm. This evidence suggests that there are important skills held by executives of a target firm that a new management group in an acquiring company cannot easily replace.

Note on theories of takeovers: Some research considers takeovers as a disciplinary device (e.g., Martin and McConnell, 1991). Alternatively, Shleifer and Summers (1988) argue that a takeover causes a breach of trust with stakeholders and transfers rent from them to shareholders. Jovanovic and Rousseau (2002) emphasize that a takeover can reallocate capital to better manage a firm. All of these theories presume that top managers in a target company must be replaced after a takeover. On the other hand, the synergy view of takeovers, which is supported by McGuckin and Nguyen (1995) and Matsusaka (1993), does not predict the replacement of top managers.

Note on the retention evidence: Matsusaka (1993) finds that the retention of managers in a firm targeted for takeover increases the bidder's return, and Cannella and Hambrick (1993) and Zollo and Singh (2004) find that the departure of executives from acquired firms is harmful to postacquisition performance.

Interestingly, the existing literature finds that the tenure of a CEO does not have a significant effect on his/her probability of retention after a takeover (e.g., Buchholtz et al., 2003; Wulf and Singh, 2011). Given that experience in a firm is assumed to result in the development of firm-specific skills, the lack of such an effect casts doubt on the hypothesis that firm-specific skills are required to manage a newly merged firm after a merger and acquisition (M&A).

However, managers' tenures may also have a negative impact on their retention rate after an M&A. Managers with long tenures may be expected to have difficulty adapting to the new environment, because experience gained in a smaller number of firms gives them less opportunity to improve their general human capital; if so, a newly merged firm may not want to employ a manager with a long tenure (e.g., Buchholtz et al., 2003). This negative effect could offset the positive effect of tenure on the retention rate.

This paper examines how the tenure of managers influences the retention rate of a management group after M&As in Japanese companies during the period 1990–2006. It attempts to separate the two different hypotheses regarding the effect of tenure on separation probability. We show that negative and positive effects of tenure on the managers’ separation rate coexist in the case of Japanese M&As.

The main challenge to achieving our goal is developing a way to cope with the limitations of the data. Ideally, we need a fairly rich dataset, which includes the random assignment of target managers and their history afterward, including not only their career paths, but also wage payments and obtained skills. Unfortunately, it is not possible to obtain such data for Japanese firms. These data restrictions force us to develop a new model and a new method that can extract meaningful information from the restricted dataset. We develop a general equilibrium model that is designed to analyze the separation of target-firm managers after M&As. It is shown that our structural model can be approximately estimated by a stratified Cox proportional hazards model. In addition, we propose a novel method to correct for selection biases arising from unobserved heterogeneity of managers.

Equipped with the constructed theoretical model and the new estimation method, we conduct survival analyses of Japanese titled directors, who are considered to be the top executives in Japanese companies. Because the Japanese promotion system is known to encourage investment in firm-specific human capital, we expect that the advantages and disadvantages of long tenure can be more accurately measured and investigated in Japanese companies than is the case for other countries.

Note on titled directors: Comparing U.S. CEOs and Japanese presidents, Kaplan (1994) argues that, because decisions in a Japanese firm are made on a consensus basis, it is important, for the sake of accurate comparison, to include other directors, not only presidents. Saito and Odagiri (2008) argue that, because there is heterogeneity among directors, not all directors are important decision makers and, therefore, they choose to focus on directors with titles as the important decision makers in Japanese firms. We follow Saito and Odagiri (2008) and identify directors with titles as the relevant executives in the sample target firms.

We consider two sources of managerial human capital in this paper: experience as an employee and managerial experience, which is measured by experience as a board member. Our empirical analyses show that an increase in a manager's tenure as an employee in a target company increases both the probability of being appointed to the board in a merged firm and the subsequent separation probability after this appointment. However, a longer tenure as a board member within the target firm does not have a significant impact on either the appointment probability or the separation probability. Through the lens of our theory, we can interpret these findings as follows: 1) Japanese firms after M&As value both the target firm-specific human capital and the general human capital of managers; 2) experience as an employee increases firm-specific skills, but at the expense of the accumulation of general human capital; and 3) managerial experience is likely to be more general than experience as an employee.

There have been many attempts to understand the retention of managers after M&As. This paper makes two main contributions to the literature. First, to the best of our knowledge, ours is the first paper that constructs a structural model of managers’ retention after M&As. Second, we propose a novel method to correct for selection biases arising from the unobserved heterogeneity of managers, which does not require a random sample from the population. These two contributions are discussed in detail below.

Note on the related literature: Walsh (1988), Walsh (1989), and Walsh and Ellwood (1991) investigate several factors that influence the turnover of top managers after M&As. More recently, Wulf (2004), Hartzell et al. (2004), and Bargeron et al. (2009) focus on the power of target CEOs to negotiate benefits, which includes a position in a new firm, as an expense borne by shareholders. Finally, Mateos de Cabo et al. (2014) investigate how individual characteristics, including gender or membership of a minority group, influence appointments to directorships after M&As.

We develop a structural model to disentangle the several mechanisms underlying the causal effects of tenure on the separation of target managers after M&As given several data limitations. However, in contrast to the standard structural estimation approach of, for instance, Rust (1987) and Hotz and Miller (1993), we do not attempt to estimate all structural parameters. Instead, we focus on particular parameters of interest, namely the coefficients of the effects of tenure on the separation rates. Our model is designed to clarify the conditions under which we can obtain economically meaningful interpretations from these parameters. Given the data limitations that we face, this strategy allows us to utilize weaker parametric assumptions for our identification, resulting in greater transparency of the source of variations for the identification.

Note on the estimation strategy: Similar to our empirical strategy, Chetty (2009) and Heckman (2010) discuss the benefits of estimating well-focused parameters rather than full structural parameters.

Let us describe the intuition on how to identify some mechanisms from the coefficients of the estimated effect of tenure on turnover in conditions where the data are constrained. In particular, we are concerned about two data problems: the lack of data about skills and the lack of data about compensation. First, because we cannot observe skills, we must differentiate two different hypotheses regarding the length of tenure and the separation probability of managers. Our identification strategy is built on the recognition that, although target-specific human capital is important as an input for current production, employment of managers with the ability to learn in a new environment is an investment in future production. More concretely, on the one hand, because managers in an acquiring firm do not have much knowledge of the target firm's organization, the acquirer initially regards the specific skills held by the target firm managers as highly valuable in managing and restructuring the target firm. On the other hand, because target managers accumulate knowledge of the new firm over time, the difference between the amount of knowledge accumulated by a manager who learns quickly and one who is a slow learner becomes larger as time goes by. It is shown that the timing of separation reflects the relative importance of each type of human capital.

Note on the learning argument: This intuition may not be true if the amount of total knowledge that must be learned is so little that a quick learner can immediately understand everything and if an acquiring firm can wait until a slow learner catches up. However, this case is less likely to be important for our empirical studies. First, the literature on the learning curve (e.g., Jovanovic and Nyarko, 1995) suggests that a rise in productivity continues for a long time, especially if tasks are complex, which is likely to be the case for management jobs. Second, as most firms are subject to seasonal events, it is less likely that every operation can be understood without spending at least one year in the firm. Therefore, we focus on one year as the period over which extensive learning occurs in our empirical study. Third, because, as Kaplan (1994) argues, the decisions in many Japanese firms (the focus of our empirical studies) are made on a consensus basis, it is important to learn what people think about a particular strategy. As suggested by the literature on higher-order beliefs, learning about the beliefs of other people on random objects is more difficult than learning about the random objects themselves (see Veldkamp, 2011). Hence, it is likely that a target-firm manager needs time to learn the consensus views held by the managers in the new company. Finally, there is no economic reason for an acquiring firm to keep a slow learner until they start to catch up.

This seemingly intuitive argument requires nontrivial theoretical consideration of learning capability. Although the argument in the previous paragraph implicitly assumes that a newly merged firm is willing to employ managers with high learning capability, this may not be true. Because the ability to learn in a new environment is a general ability, market competition can raise the compensation of able managers, which may make the firm reluctant to employ them. Therefore, the validity of the above argument depends on the structure of the external labor market for managers. Because we develop a general equilibrium model, we can discuss how the frictions in the managerial external labor market influence the coefficients of the effects of tenure on the separation probability.

In addition, the coefficients of the effects of tenure might be influenced by the second data problem: the lack of compensation data for Japanese managers. It is not obligatory for publicly traded companies to provide information on the compensation of managers in Japan. As suggested in the literature (e.g., Wulf, 2004, Hartzell et al., 2004, and Bargeron et al., 2009), this might cause several biases if managers in target firms can negotiate the compensation and severance pay in the newly merged firm during the process of an M&A. Our theoretical model shows that if it is impossible to make a complete contract on treatment of the new managers in the new firm during the M&A and there are some explicit or implicit transfers after the new firm starts as a result of ex post bargaining, then the separation probability does not depend either on the compensation or on the severance pay of managers. That is, the model provides a theoretical condition under which we can extract meaningful information from the coefficients of the impact of tenure on the separation rate even when compensation data are not available.

Next, we explain the second main contribution to the literature. In fact, this second contribution can be considered the more important contribution because it has a broader relevance than the post-M&A managerial retention literature. Note that because managers in a target company are not randomly assigned, the coefficient of tenure can be contaminated by another effect. If a talented person is promoted to a management position faster than is usually the case, the short tenures observed in our data may simply indicate that a manager has a high unobserved ability.

In order to deal with these selection biases, we propose a novel method to extract information about the unobserved heterogeneity of workers from the data on managers. Intuitively, if a talented person is promoted to a management position faster, the length of tenure prior to becoming a manager must contain useful information about their unobserved ability. We utilize this information to identify and control for unobserved ability to estimate the causal effects of tenure.

Compared with the standard sample selection models that have appeared in econometrics textbooks (e.g., Wooldridge, 2002; Cameron and Trivedi, 2005), our approach has three advantages. First, because we utilize the timing of selection in a selected sample to correct for selection bias, we do not need access to a random sample from the population. We are not aware of any other papers that provide a tractable method to correct a selection bias inherent to personnel data using only the selected sample. Second, because the timing of observations is generally different from the timing of promotions, we can find several exclusive variables that are required to identify parameters and obtain reliable estimates. Finally, although a standard two-step estimator makes distributional assumptions to correct selection biases, we rely on extreme value theory to obtain results that are robust to misspecification of distributional assumptions. Gabaix and Landier (2008) argue that extreme value theory can provide a good approximation for the upper tail of a large class of continuous distributions. We apply their argument and derive an explicit solution to the conditional expectation of unobserved ability under a selected sample, which is used to determine unbiased estimates of parameters under a selected sample.

As discussed before, this methodology is applicable to a broader literature than that of managerial retention after M&As. Although it is recognized that the characteristics of CEOs change the strategies of firms (e.g., Bertrand and Schoar, 2003), researchers face difficulties in examining the causal role of CEO characteristics in a firm because publicly available data on top managers typically involve selected samples. We propose a method to deal with this problem. Moreover, our methodology can be applied to data on internal promotions, when researchers are not able to access data on employees’ whole careers. We hope that this novel approach can help researchers struggling with data limitations to evaluate the role of leaders.

In addition to these two main contributions to the literature, our results contribute to several streams of literature, including those on managerial compensation and Japanese M&As. We clarify the contribution of our results to these fields in Section 8.

The paper is organized as follows. Section 2 presents a general equilibrium model of the separation of target managers after M&As. It shows how target firm-specific human capital and general human capital can influence the separation decision after M&As. Section 3 links the length of tenure and each type of human capital and provides a number of testable conditions that distinguish several hypotheses on managerial human capital based on the evidence of the effect of tenure on separation probability. Section 4 introduces our empirical model and discusses how to correct a selection bias using our selected dataset. Section 5 describes our Japanese dataset and Section 6 presents our estimation model and our results, interpreted through the lens of our theory. Section 7 discusses whether the moral hazard of managers could influence the interpretation of our results. Section 8 relates our results to several streams of literature that cover managerial compensation and Japanese M&As. Section 9 concludes the paper.

A General Equilibrium Model of Managerial Separation after M&A

In this section, we construct a general equilibrium model that is designed to examine the separation of target managers after M&As using the following procedure. First, we discuss the assumptions on the production functions, which describe how target firm-specific human capital and general human capital influence the production process. Second, using the production functions, we model the separation of a manager from a target firm after an M&A by taking the outside values of managers as given. Third, we model an external managerial labor market, which allows us to determine the outside values of the managers. Finally, we show how our structural model can be approximately represented by the proportional hazards model and discuss how target firm-specific human capital and general human capital influence the parameters of the proportional hazards model.
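As a rough illustration of the proportional hazards target mentioned above, the following Python sketch evaluates a stratified Cox partial log-likelihood for a single covariate. It is a generic textbook construction and not the estimator developed in this paper; the function name, the toy records, and the choice of the M&A deal as the stratum are assumptions made only for this illustration, and ties are handled with the simple Breslow convention.

```python
import math
from collections import defaultdict


def stratified_cox_loglik(beta, records):
    """Partial log-likelihood of a Cox model with one covariate.

    records: list of (stratum, duration, event, x) tuples, where event is 1
    if a separation is observed and 0 if the spell is censored.
    Each stratum has its own unspecified baseline hazard, so risk sets
    are formed within strata only. Ties use the Breslow convention.
    """
    by_stratum = defaultdict(list)
    for stratum, duration, event, x in records:
        by_stratum[stratum].append((duration, event, x))

    loglik = 0.0
    for rows in by_stratum.values():
        for duration, event, x in rows:
            if not event:
                continue
            # Risk set: members of the same stratum still at risk at `duration`.
            denom = sum(math.exp(beta * xj) for dj, _, xj in rows if dj >= duration)
            loglik += beta * x - math.log(denom)
    return loglik


# Hypothetical toy data: (deal id, years until separation, event flag, tenure in decades).
toy = [
    ("deal_A", 1, 1, 3.0), ("deal_A", 3, 1, 1.0), ("deal_A", 4, 0, 2.0),
    ("deal_B", 2, 1, 2.5), ("deal_B", 2, 0, 0.5), ("deal_B", 5, 1, 1.5),
]

for b in (-0.5, 0.0, 0.5):
    print(b, round(stratified_cox_loglik(b, toy), 4))
```

Because each stratum keeps its own baseline hazard, any factor common to all managers in the same stratum drops out of the partial likelihood, which is the usual motivation for stratifying.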

Target Firm-Specific Human Capital vs. General Human Capital

Suppose that when a firm is merged with another firm, managers from the target firm must learn new skills and/or new routines to manage the merged firm in its new environment. We assume that all learning takes place at time $t = 0$ and that normal firm operations start from time $t = 1$ onward, where the analysis time $t$ is the length of time of operation after the year in which the new firm was created by the M&A. Assume that a manager in a target firm has general human capital $h^G$ and human capital specific to the target firm, $h^T$. Suppose that the productivity of the manager with human capital $\mathbf{h} = \{h^G, h^T\}$ is described by $p_t(\mathbf{h})$:

$$
p_t(\mathbf{h}) = f_{Gs}(h^G) + f_{Ts}(h^T) + \zeta(t:X), \qquad
s = \begin{cases} l & \text{if } t = 0, \\ n & \text{if } t \ge 1, \end{cases}
$$

where the vector $X$ includes any variables that can be controlled for in our empirical study and $\zeta(t:X)$ captures the transitional dynamics of productivity after the M&A that is unrelated to human capital accumulation.

Note on macroeconomic shocks: Although we construct a model under a stationary environment in this paper, macroeconomic shocks can be easily incorporated by multiplying all variables by $A(x)$, which denotes macroeconomic shocks in year $x$. Given this specification, we can show that the introduction of $A(x)$ does not change the results at all.

The production technologies $f_{Gs}$ and $f_{Ts}$ depend on the state $s$. We use $s = l$ to denote a learning state, whereas $s = n$ denotes the normal operation state. We assume that $f_{Gs}(h^G) \ge 0$ and $f_{Ts}(h^T) \ge 0$ for any $s \in \{n, l\}$ and any $h^G$ and $h^T$. We also assume that $f_{Tl}'(h^T) \ge f_{Tn}'(h^T) \ge 0$ and $f_{Gn}'(h^G) \ge f_{Gl}'(h^G) \ge 0$.

The assumption of $f_{Tn}'(h^T) \le f_{Tl}'(h^T)$ means that an increase in the target firm-specific human capital raises productivity by more in the initial learning state than in the normal operation state. When a merger occurs, managers in the acquiring firm do not have enough knowledge about the target firm. Hence, initially, the firm-specific knowledge relating to the target firm must be more valuable. The assumption captures this effect.

The assumption of $f_{Gn}'(h^G) \ge f_{Gl}'(h^G) \ge 0$ may require more explanation. There are several plausible micro foundations for this assumption. Our preferred explanation is that the operation of a new firm requires new firm-specific knowledge, which a manager with higher general human capital can more effectively learn. To clarify this point, we can construct an illustrative model, such as $f_{Gn}(h^G) = f_{Gl}(h^G) + f_N(h^N)$ and $h^N = g(h^G)$, where $h^N$ is new firm-specific human capital and $g(h^G)$ is a learning function. We also assume that $f_N(h^N) \ge 0$, $g(h^G) \ge 0$ and $g'(h^G) \ge 0$. Under these assumptions, and provided that $f_N$ is nondecreasing, it can be shown that $f_{Gn}'(h^G) = f_{Gl}'(h^G) + f_N'(g(h^G))\, g'(h^G) \ge f_{Gl}'(h^G)$.
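The inequality in the illustrative model can be checked numerically for concrete functional forms. The sketch below assumes $f_{Gl}(h) = \log(1 + h)$, $g(h) = 1 - e^{-h}$ and $f_N(x) = 2x$; these forms are hypothetical choices that merely satisfy the stated assumptions (the paper does not specify functional forms), and the derivatives are approximated by central finite differences.

```python
import math


# Hypothetical functional forms, chosen only to satisfy the stated assumptions.
def f_Gl(h):
    """Value of general human capital in the learning state."""
    return math.log(1.0 + h)


def g(h):
    """Learning function: new firm-specific capital h^N = g(h^G)."""
    return 1.0 - math.exp(-h)


def f_N(h_new):
    """Value of new firm-specific human capital (nonnegative, nondecreasing)."""
    return 2.0 * h_new


def f_Gn(h):
    """Normal-state technology: f_Gn(h) = f_Gl(h) + f_N(g(h))."""
    return f_Gl(h) + f_N(g(h))


def derivative(f, h, step=1e-6):
    """Central finite-difference approximation of f'(h)."""
    return (f(h + step) - f(h - step)) / (2.0 * step)


for h in (0.5, 1.0, 2.0, 4.0):
    d_learn = derivative(f_Gl, h)
    d_normal = derivative(f_Gn, h)
    print(f"h^G={h}: f'_Gl={d_learn:.4f}  f'_Gn={d_normal:.4f}  gap={d_normal - d_learn:.4f}")
```

Under these assumed forms the gap between the two derivatives equals $f_N'(g(h^G))\,g'(h^G) = 2e^{-h^G}$, which is positive and shrinks as general human capital grows.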

The assumption of $g'(h^G) \ge 0$ attempts to capture the idea that general human capital can improve the speed of learning. Nelson and Phelps (1966) consider that education increases the speed of technology adoption because educated people are better able to understand new technology. Schultz (1975) argues that human intelligence can be used to interpret new information and adapt to a new environment, and refers to such intelligence as entrepreneurial ability. Gibbons and Waldman (2006) construct a model of promotion in which education can increase learning capability, arguing that this concept has significant empirical support. We attempt to incorporate this idea in our model.

The Separation of a Manager from a Target Firm after an M&A

Now, we construct a model of managerial separation. The present value of the expected profit sequences from employing a particular manager, $J_t(\mathbf{h})$ at $t \ge 1$, can be described as follows:

$$
J_t(\mathbf{h}) = p_t(\mathbf{h}) - w_t + \beta E_\varepsilon\left[
\begin{aligned}
& I\left(W_{t+1}(\mathbf{h}) \ge \hat W^l(h^G) + P_{t+1} + \varepsilon_{t+1}\right) \max\left\{J_{t+1}(\mathbf{h}),\ \hat J^0 - P_{t+1}\right\} \\
& + \left[1 - I\left(W_{t+1}(\mathbf{h}) \ge \hat W^l(h^G) + P_{t+1} + \varepsilon_{t+1}\right)\right]\left(\hat J^0 - P_{t+1}\right)
\end{aligned}
\right],
$$

where $w_t$ is the manager's compensation at time $t$, $P_{t+1}$ is the severance pay at time $t + 1$, $\beta \in (0,1)$ is a discount factor, $\varepsilon_t$ is the sum of any idiosyncratic random benefits that the manager can obtain if he/she leaves the firm at time $t$, $E_\varepsilon[\cdot]$ is the expectation operator with respect to $\varepsilon_{t+1}$, $\hat J^0$ is the present value of the expected net profit sequences from employing a new manager, and $W_t(\mathbf{h})$ and $\hat W^l(h^G)$ are the present values of the sum of the expected income flows when a manager stays and when he/she leaves the firm at time $t$, respectively. Note that when managers stay in the same firm, their value functions depend on both target firm-specific human capital and general human capital, $\mathbf{h}$, whereas when they leave, their value functions depend only on their general human capital, $h^G$.

We assume that firms review their managers every period. Assuming that the manager prefers to stay, which means that $I\left(W_{t+1}(\mathbf{h}) \ge \hat W^l(h^G) + P_{t+1} + \varepsilon_{t+1}\right) = 1$, the firm has to decide whether to retain the manager as a board member. If it retains the manager, it expects to obtain $J_{t+1}(\mathbf{h})$ in the future from the manager. However, if it fires the manager, the firm must find a new manager, which yields $\hat J^0$, and must pay the severance pay $P_{t+1}$, so that it receives $\hat J^0 - P_{t+1}$. Because a firm is assumed to maximize its profit, the expected profit would be $\max\left\{J_{t+1}(\mathbf{h}),\ \hat J^0 - P_{t+1}\right\}$. If $W_{t+1}(\mathbf{h}) < \hat W^l(h^G) + P_{t+1} + \varepsilon_{t+1}$, the manager leaves, irrespective of the firm's decision and, therefore, the firm receives $\hat J^0 - P_{t+1}$.

Similarly, the present values of the sum of the expected income flows of managers in the firm at the analysis time $t$, $W_t(\mathbf{h})$, can be described as follows:

$$
W_t(\mathbf{h}) = w_t + \beta E_\varepsilon\left[
\begin{aligned}
& I\left(J_{t+1}(\mathbf{h}) \ge \hat J^0 - P_{t+1}\right) \max\left\{W_{t+1}(\mathbf{h}),\ \hat W^l(h^G) + P_{t+1} + \varepsilon_{t+1}\right\} \\
& + \left(1 - I\left(J_{t+1}(\mathbf{h}) \ge \hat J^0 - P_{t+1}\right)\right)\left(\hat W^l(h^G) + P_{t+1} + \varepsilon_{t+1}\right)
\end{aligned}
\right].
$$

When a manager stays in a firm, in every period that follows, he/she reassesses and decides whether to stay with the firm or leave. If the manager stays, he/she expects to obtain $W_{t+1}(\mathbf{h})$. If the manager leaves, he/she expects to obtain $\hat W^l(h^G) + P_{t+1} + \varepsilon_{t+1}$. Hence, if a firm decides to continue employing the manager, $I\left(J_{t+1}(\mathbf{h}) \ge \hat J^0 - P_{t+1}\right) = 1$, the expected value must be $\max\left\{W_{t+1}(\mathbf{h}),\ \hat W^l(h^G) + P_{t+1} + \varepsilon_{t+1}\right\}$. If the firm decides to fire the manager, $I\left(J_{t+1}(\mathbf{h}) \ge \hat J^0 - P_{t+1}\right) = 0$, the manager must leave and receive the outside benefits of $\hat W^l(h^G) + P_{t+1} + \varepsilon_{t+1}$, irrespective of his/her own decision.

We assume that:

$$
w_t = \hat w_t + \tilde w_t,
$$

where $\hat w_t$ is the contracted compensation, whereas $\tilde w_t \in \mathbf{R}$ is the negotiable compensation. As suggested in the literature (e.g., Wulf, 2004, Hartzell et al., 2004, and Bargeron et al., 2009), a manager may control the sale of the firm and refuse to take a deal in the M&A process unless he/she is personally promised something in return, such as a position on the board of the new firm. As a result of this negotiation, a target-firm manager and the acquiring firm may write an explicit contract on $(\hat w_t, P_t)$ before the new firm starts its operation. However, we consider that the contract in the real world is far from complete and that it leaves room for negotiable transfers $\tilde w_t$ after the new firm starts. We consider that $\tilde w_t$ can be a negative implicit transfer because it may involve harassment and mental pressure. We show that the existence of $\tilde w_t$ makes the separation decision independent of $(\hat w_t, \tilde w_t, P_t)$.

We define the joint surplus of the manager and the new firm by $S_t(\mathbf{h}) \equiv J_t(\mathbf{h}) + W_t(\mathbf{h}) - \hat J^0 - \hat W^l(h^G)$. The proof of the following proposition is provided in the Appendix.

Proposition 1

Suppose that the manager and the firm can negotiate $\tilde w_t \in \mathbf{R}$ at the beginning of time $t$. The separation occurs at time $t$ if and only if $S_t(\mathbf{h}) < \varepsilon_t$, and the dynamics of $S_t(\mathbf{h})$ can be described by:

$$
S_t(\mathbf{h}) = p_t(\mathbf{h}) - b(h^G) + \beta\left[ S_{t+1}(\mathbf{h}) + E_\varepsilon\left[ I\left(\varepsilon_{t+1} > S_{t+1}(\mathbf{h})\right)\left(\varepsilon_{t+1} - S_{t+1}(\mathbf{h})\right) \right] \right],
$$

where $b(h^G) = (1 - \beta)\left[\hat W^l(h^G) + \hat J^0\right]$ is the sum of the expected flow benefits received by both a manager and a firm from the outside market.

Proposition 1 shows that the separation decision does not depend on $(\hat w_t, \tilde w_t, P_t)$, and that it depends only on the joint surplus of the target manager and the merged firm, $S_t(\mathbf{h})$. This means that, although the negotiation during an M&A may result in a complicated compensation scheme and severance pay, it does not influence the separation decision as long as a firm and a manager can flexibly negotiate the ex post transfer.

In addition, Proposition 1 shows what determines the joint surplus. For any $t$, maintaining the relationship brings a joint instantaneous surplus of $p_t(\mathbf{h}) - (1 - \beta)\left(\hat J^0 + \hat W^l(h^G)\right)$. If they maintain their relationship, the manager and firm can enjoy $S_{t+1}(\mathbf{h})$ in the next period. However, if the instantaneous random benefits from the outside market happen to be greater than the joint surplus, $\varepsilon_{t+1} > S_{t+1}(\mathbf{h})$, then they will decide to separate and obtain the additional value of $\varepsilon_{t+1} - S_{t+1}(\mathbf{h})$. Because they can separate only when the random benefits are large enough to exceed the joint surplus, the possibility of separation simply provides each party with the additional value arising from separation. This effect is captured by $E_\varepsilon\left[ I\left(\varepsilon_{t+1} > S_{t+1}(\mathbf{h})\right)\left(\varepsilon_{t+1} - S_{t+1}(\mathbf{h})\right) \right]$.

Assume that $\varepsilon_t$ is exponentially distributed with a parameter $\lambda$. The separation probability, $q_t$, is shown to be a function of $S_t(\mathbf{h})$, as follows:

$$
q_t(\mathbf{h}) = E_\varepsilon\left[ I\left(\varepsilon_t > S_t(\mathbf{h})\right) \right] = e^{-\lambda S_t(\mathbf{h})}.
$$

This suggests that increases in the joint surplus, $S_t(\mathbf{h})$, reduce the probability of separation.

Given this distributional assumption, the expected gain from separation is shown to be proportional to the separation probability:

$$
E_\varepsilon\left[ I\left(\varepsilon_{t+1} > S_{t+1}(\mathbf{h})\right)\left(\varepsilon_{t+1} - S_{t+1}(\mathbf{h})\right) \right] = \frac{q_{t+1}(\mathbf{h})}{\lambda}.
$$
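For completeness, both closed forms follow directly from the exponential density. The short derivation below is a sketch that assumes $\lambda$ is the rate parameter, so that the density is $\lambda e^{-\lambda \varepsilon}$ on $\varepsilon \ge 0$, assumes a nonnegative joint surplus, and writes $S$ for $S_{t+1}(\mathbf{h})$ and $q$ for $q_{t+1}(\mathbf{h})$; the change of variable $u = \varepsilon - S$ is the only step used.

```latex
\begin{aligned}
q &= \Pr(\varepsilon > S)
   = \int_{S}^{\infty} \lambda e^{-\lambda \varepsilon}\, d\varepsilon
   = e^{-\lambda S}, \\
E_\varepsilon\!\left[ I(\varepsilon > S)\,(\varepsilon - S) \right]
  &= \int_{S}^{\infty} (\varepsilon - S)\, \lambda e^{-\lambda \varepsilon}\, d\varepsilon
   = e^{-\lambda S} \int_{0}^{\infty} u\, \lambda e^{-\lambda u}\, du
   = \frac{e^{-\lambda S}}{\lambda}
   = \frac{q}{\lambda}.
\end{aligned}
```

The second line is the memoryless property of the exponential distribution: conditional on exceeding $S$, the excess $\varepsilon - S$ has mean $1/\lambda$, and the event itself has probability $q$.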

Inserting the expected gain from separation into the dynamics of the joint surplus, we can rewrite equation (1) and summarize how the two types of human capital influence the separation probability using the following equations:
\[
\begin{aligned}
q_t(\mathbf{h}) &= e^{-\lambda S_t(\mathbf{h})}, \\
S_t(\mathbf{h}) &= p_t(\mathbf{h}) - b(h^G) + \beta\left[S_{t+1}(\mathbf{h}) + \frac{e^{-\lambda S_{t+1}(\mathbf{h})}}{\lambda}\right].
\end{aligned}
\]
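To see how the recursion maps the instantaneous surplus into a separation probability, it can help to solve a stationary special case. The sketch below assumes, purely for illustration, that $p_t(\mathbf{h}) - b(h^G)$ equals a non-negative constant `c` in every period, so that $S_t$ is a fixed point of the recursion. The names `stationary_surplus` and `separation_probability` and all parameter values are hypothetical.

```python
import math

def stationary_surplus(c, beta, lam, tol=1e-12, max_iter=10_000):
    """Fixed point of S = c + beta * (S + exp(-lam * S) / lam), for c >= 0.

    For S >= 0 the right-hand side has slope beta * (1 - exp(-lam * S)) < beta,
    so simple iteration from S = 0 converges.
    """
    S = 0.0
    for _ in range(max_iter):
        S_next = c + beta * (S + math.exp(-lam * S) / lam)
        if abs(S_next - S) < tol:
            return S_next
        S = S_next
    raise RuntimeError("fixed-point iteration did not converge")

def separation_probability(S, lam):
    """q = exp(-lam * S), the probability that eps exceeds the joint surplus."""
    return math.exp(-lam * S)

beta, lam = 0.95, 1.0                          # illustrative values (assumption)
low = stationary_surplus(0.05, beta, lam)      # low instantaneous surplus
high = stationary_surplus(0.10, beta, lam)     # high instantaneous surplus
print(separation_probability(low, lam), separation_probability(high, lam))
```

In this stationary sketch a higher instantaneous surplus raises $S$ and lowers the separation probability, in line with the comparative statics discussed in the text.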

Note that both target firm-specific and general human capital influence the separation probability through an increase in productivity, $p_t(\mathbf{h})$. General human capital has an additional effect on separation through a change in the outside value, $b(h^G)$. Therefore, without knowing the structure of the function $b(h^G)$, we cannot make any clear theoretical prediction about the impact of general human capital on the separation probability.

The Model of the Managerial Labor Market

We model an external labor market and endogenize $b(h^G) = (1-\beta)\left(\hat W^l(h^G) + \hat J^0\right)$. Suppose that an external managerial labor market is segregated by the type of worker $h^G$. A firm can find a worker with $h^G$ by paying a hiring cost $F(h^G)$ such that:
\[
\hat J^0 = \max_{h^G}\left\{\hat J^l(h^G) - F(h^G)\right\},
\]
where:
\[
\begin{aligned}
\hat J^l(h^G) &= f_{Gl}(h^G) - \hat w^l(h^G) + \beta\left[\hat q\,\hat J^l(h^G) + (1-\hat q)\,\hat J^n(h^G)\right], \\
\hat J^n(h^G) &= f_{Gn}(h^G) - \hat w^n(h^G) + \beta\left[\hat q\,\hat J^l(h^G) + (1-\hat q)\,\hat J^n(h^G)\right].
\end{aligned}
\]
Here $\hat q$ is the separation probability, and $\hat w^s(h^G)$ and $\hat J^s(h^G)$ are, respectively, the compensation for a manager and the present value of the expected profit sequence from employing a manager who has general skill $h^G$ in state $s$. We assume that $F(h^G) \ge 0$ and $F'(h^G) \ge 0$. These assumptions imply that it is difficult to find a manager and even more difficult to find an able manager. Similar to the situation for a new firm after an M&A, there is a learning state and a normal operation state for the production function. In contrast to the situation for the new firm after an M&A, productivity does not depend on $f_{Ts}(h^T)$ and $\zeta(t:\mathbf{X})$. This means that target firm-specific human capital is useless and that there are no transition dynamics for firms that do not experience an M&A. As a result, the environment is stationary and the separation probability $\hat q$ does not depend on $t$.

Suppose that managerial talent is a scarce resource and all firms competitively post an initial wage to attract a manager as long as profits are positive. Because of the price competition, $\hat J^0 = 0$ at the equilibrium. Hence, for all hirable $h^G$, $\hat J^l(h^G) = F(h^G)$ and, therefore:
\[
\hat w^l(h^G) = f_{Gl}(h^G) + \beta(1-\hat q)\,\hat J^n(h^G) - F(h^G).
\]

Similarly, the present values of the sum of the expected income flows of a manager who has general skill $h^G$ in state $s$, $\hat W^s(h^G)$, are described as follows:
\[
\begin{aligned}
\hat W^l(h^G) &= \hat w^l(h^G) + \beta\left[\hat q\,\hat W^l(h^G) + (1-\hat q)\,\hat W^n(h^G)\right], \\
\hat W^n(h^G) &= \hat w^n(h^G) + \beta\left[\hat q\,\hat W^l(h^G) + (1-\hat q)\,\hat W^n(h^G)\right].
\end{aligned}
\]

We define the surplus under normal operation as $\hat S^n(h^G) = \hat J^n(h^G) + \hat W^n(h^G) - \hat W^l(h^G)$. Then, it can be shown that:
\[
\hat S^n(h^G) = f_{Gn}(h^G) - f_{Gl}(h^G) + F(h^G) \ge 0.
\]

Note that $\hat S^n(h^G) \ge 0$ means that there is a surplus that managers and firms can share. Hence, they can adjust $\hat w^n(h^G)$ to maintain their relationship. Note also that $\hat S^n(h^G) \ge 0$ as long as learning increases productivity, $f'_{Gn}(h^G) = f'_{Gl}(h^G) + f'_N\left(g(h^G)\right)g'(h^G) \ge f'_{Gl}(h^G)$, and there is a hiring cost, $F(h^G) \ge 0$.

Because $\hat J^0 = 0$ at the equilibrium, $b(h^G) = (1-\beta)\hat W^l(h^G)$. Substituting equation (2) into equation (3), we can derive:
\[
b(h^G) = f_{Gl}(h^G) - F(h^G) + \beta(1-\hat q)\,\hat S^n(h^G).
\]
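One way to see the substitution step is to combine the recursion for $\hat W^l(h^G)$, the equilibrium wage $\hat w^l(h^G)$, and the definition of $\hat S^n(h^G)$ stated above. The following is a sketch of the intermediate algebra, with the argument $h^G$ suppressed:
\[
\begin{aligned}
(1-\beta)\hat W^l
&= \hat w^l + \beta(1-\hat q)\left(\hat W^n - \hat W^l\right) \\
&= f_{Gl} - F + \beta(1-\hat q)\left[\hat J^n + \hat W^n - \hat W^l\right] \\
&= f_{Gl} - F + \beta(1-\hat q)\,\hat S^n .
\end{aligned}
\]
The first line subtracts $\beta \hat W^l$ from both sides of the recursion for $\hat W^l$, the second inserts the equilibrium wage, and the third uses the definition of the surplus under normal operation.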

Using this result, we can summarize the separation probability derived from our general equilibrium model of the managerial separation after an M&A in the following proposition.

Proposition 2

The separation probability of a target manager after an M&A is characterized by the following equations:
\[
\begin{aligned}
q_t(\mathbf{h}) &= e^{-\lambda S_t(\mathbf{h})}, \\
S_t(\mathbf{h}) &= \pi^s(\mathbf{h}) + \zeta(t:\mathbf{X}) + \beta\left[S_{t+1}(\mathbf{h}) + \frac{e^{-\lambda S_{t+1}(\mathbf{h})}}{\lambda}\right],
\end{aligned}
\]
where:
\[
\begin{aligned}
\pi^n(\mathbf{h}) &\equiv f_{Tn}(h^T) + \left[1 - \beta(1-\hat q)\right]\hat S^n(h^G), \\
\pi^l(\mathbf{h}) &\equiv f_{Tl}(h^T) + F(h^G) - \beta(1-\hat q)\,\hat S^n(h^G), \\
\hat S^n(h^G) &= f_{Gn}(h^G) - f_{Gl}(h^G) + F(h^G).
\end{aligned}
\]

The proposition shows that not only the production function but also the hiring cost function influences the separation probability of the target manager after an M&A.

Proportional Hazard Model of Separation

We show how our structural model can be approximately represented by a proportional hazards model. The advantage of the proportional hazards model is that it requires no parametric assumption on the baseline hazard and, therefore, imposes no restriction on the shape of the hazard over time. In particular, we can conduct our empirical analysis without making any assumption on $\zeta(t:\mathbf{X})$.

We assume that after the merger, a new firm faces a transition period and eventually converges to an operation state that is similar to those of other firms that did not experience an M&A. We derive an approximate solution of $S_t(\mathbf{h})$. We decompose $\mathbf{X}$ into $\mathbf{X} = (\mathbf{X}_Z, \mathbf{X}_V)$ and $\zeta(t:\mathbf{X})$ into $v^s(\mathbf{X}_V)$ and $z(t:\mathbf{X}_Z)$, so that $\zeta(t:\mathbf{X}) = v^s(\mathbf{X}_V) + z(t:\mathbf{X}_Z)$, where $s \in \{l, n\}$. Taking a first-order approximation of $e^{-\lambda S_{t+1}(\mathbf{h})}/\lambda$ around $e^{-\lambda S_{t+1}(\mathbf{h})} = \hat q$, we can derive an approximate solution of $S_t(\mathbf{h})$ for $t \ge 1$:
\[
\begin{aligned}
S_t(\mathbf{h}) &\approx \Pi^n(\mathbf{h}) + Z(t:\mathbf{X}_Z), \\
\Pi^n(\mathbf{h}) &= \frac{\pi^n(\mathbf{h}) + v^n(\mathbf{X}_V)}{1 - \beta(1-\hat q)}, \\
Z(t:\mathbf{X}_Z) &= \sum_{\tau=0}^{\infty}\left[\beta(1-\hat q)\right]^{\tau} z(t+\tau:\mathbf{X}_Z) + \frac{\hat\beta(\hat q)}{1 - \beta(1-\hat q)},
\end{aligned}
\]
where $\hat\beta(\hat q) = \beta\left(\dfrac{1 - \ln \hat q}{\lambda}\right)\hat q$.
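The approximation can be reconstructed in two steps. The following is a sketch of one reading of the derivation, linearizing in $S_{t+1}$ around the point $\hat S = -\ln \hat q / \lambda$ at which $e^{-\lambda \hat S} = \hat q$; the symbol $\hat S$ is introduced here only for this illustration.
\[
\begin{aligned}
\frac{e^{-\lambda S_{t+1}}}{\lambda}
&\approx \frac{\hat q}{\lambda} - \hat q\left(S_{t+1} - \hat S\right)
= \frac{\hat q\,(1 - \ln \hat q)}{\lambda} - \hat q\, S_{t+1}, \\
\beta\left[S_{t+1} + \frac{e^{-\lambda S_{t+1}}}{\lambda}\right]
&\approx \beta(1-\hat q)\,S_{t+1} + \hat\beta(\hat q), \\
S_t
&\approx \pi^n(\mathbf{h}) + v^n(\mathbf{X}_V) + z(t:\mathbf{X}_Z) + \hat\beta(\hat q) + \beta(1-\hat q)\,S_{t+1}.
\end{aligned}
\]
Iterating the last line forward with discount factor $\beta(1-\hat q)$ collects the time-invariant terms $\pi^n(\mathbf{h}) + v^n(\mathbf{X}_V)$ into $\Pi^n(\mathbf{h})$ and the remaining terms into $Z(t:\mathbf{X}_Z)$.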

Similarly, the present value of the stream of instantaneous joint surpluses at time 0, $S_0(\mathbf{h})$, can be expressed as:
\[
S_0(\mathbf{h}) = \pi^l(\mathbf{h}) + v^l(\mathbf{X}_V) + z(0:\mathbf{X}_Z) + \beta\left[S_1(\mathbf{h}) + \frac{e^{-\lambda S_1(\mathbf{h})}}{\lambda}\right].
\]

Therefore, using the same approximation, we can show that:
\[
S_0(\mathbf{h}) \approx \Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h}) + Z(0:\mathbf{X}_Z),
\]
where $\Delta\pi(\mathbf{h}) = \pi^l(\mathbf{h}) - \pi^n(\mathbf{h}) + v^l(\mathbf{X}_V) - v^n(\mathbf{X}_V)$.

The following proposition summarizes the results.

Proposition 3

Separation probability can be estimated using the following proportional hazards model:
\[
\begin{aligned}
q_t &= B(t:\mathbf{X}_Z)\, e^{-\lambda\left[\Delta\pi(\mathbf{h})\, I(t=0) + \Pi^n(\mathbf{h})\right]}, \\
\Pi^n(\mathbf{h}) &= \frac{\pi^n(\mathbf{h}) + v^n(\mathbf{X}_V)}{1 - \beta(1-\hat q)},
\end{aligned}
\]
where:
\[
\begin{aligned}
\Delta\pi(\mathbf{h}) &= f_{Gl}(h^G) - f_{Gn}(h^G) + f_{Tl}(h^T) - f_{Tn}(h^T) + v^l(\mathbf{X}_V) - v^n(\mathbf{X}_V), \\
\pi^n(\mathbf{h}) &= f_{Tn}(h^T) + \left[1 - \beta(1-\hat q)\right]\left[f_{Gn}(h^G) - f_{Gl}(h^G) + F(h^G)\right],
\end{aligned}
\]
$B(t:\mathbf{X}_Z) = e^{-\lambda Z(t:\mathbf{X}_Z)}$, and $I(t=0) = 1$ when $t = 0$ and $I(t=0) = 0$ otherwise.

We assume that the acquiring firm appoints the manager in the target firm to the board of the new firm at the beginning of $t = 0$. Because the probability of being appointed as a manager is equivalent to $1 - q_0$, investigating the separation probability in the initial period provides information about the appointment probability as well. As discussed above, the baseline hazard $B(t:\mathbf{X}_Z)$ is a function of $z(t+\tau:\mathbf{X}_Z)$. Therefore, we can estimate this proportional hazards model without making any parametric assumption on $z(t+\tau:\mathbf{X}_Z)$.
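The structure of the hazard in Proposition 3 can be illustrated with a few lines of code. This is a minimal sketch, assuming the baseline term and the two surplus components are supplied as plain numbers. The function name `separation_hazard` and the numerical values are hypothetical.

```python
import math

def separation_hazard(t, baseline, lam, delta_pi, big_pi_n):
    """q_t = B(t) * exp(-lam * (delta_pi * I(t = 0) + Pi^n)).

    `baseline` stands in for B(t : X_Z); it is left unrestricted in the model,
    so here it is simply passed in by the caller.
    """
    indicator = 1.0 if t == 0 else 0.0
    return baseline * math.exp(-lam * (delta_pi * indicator + big_pi_n))

lam, delta_pi, big_pi_n = 1.0, 0.3, 2.0   # illustrative values (assumption)
q0 = separation_hazard(0, baseline=0.9, lam=lam, delta_pi=delta_pi, big_pi_n=big_pi_n)
q1 = separation_hazard(1, baseline=0.9, lam=lam, delta_pi=delta_pi, big_pi_n=big_pi_n)
print(q0 / q1)  # with equal baselines this ratio is exp(-lam * delta_pi)
```

Holding the baseline fixed, the initial-period term shifts the log hazard by $-\lambda\Delta\pi(\mathbf{h})$ at $t = 0$ only, which is why it can be picked up by an interaction with a $t = 0$ dummy.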

Let us examine how target firm-specific and general human capital influence the parameters of the proportional hazards model. For $x = G$ or $T$, we have
\[
\frac{\partial q_t}{\partial h^x} = q_t\left[-\lambda \frac{\partial \Pi^n(\mathbf{h})}{\partial h^x}\right]
\]
for any $t \ge 1$ and
\[
\frac{\partial q_0}{\partial h^x} = q_0\left[-\lambda\left(\frac{\partial \Delta\pi(\mathbf{h})}{\partial h^x} + \frac{\partial \Pi^n(\mathbf{h})}{\partial h^x}\right)\right].
\]
Hence, we can analyze the impact of human capital on the separation rate through $\dfrac{\partial\left(-\lambda \Pi^n(\mathbf{h})\right)}{\partial h^x}$, $\dfrac{\partial\left(-\lambda \Delta\pi(\mathbf{h})\right)}{\partial h^x}$, and $\dfrac{\partial\left(-\lambda\left[\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})\right]\right)}{\partial h^x}$, where $x = G$ or $T$. We obtain the following proposition.

Proposition 4

The impacts of target firm-specific human capital and general human capital on the proportional hazards model can be analyzed by the following equations:

1. The impacts of target firm-specific human capital:
\[
\begin{aligned}
\frac{\partial\left(-\lambda \Pi^n(\mathbf{h})\right)}{\partial h^T} &= \frac{-\lambda f'_{Tn}(h^T)}{1 - \beta(1-\hat q)} \le 0, \\
\frac{\partial\left(-\lambda \Delta\pi(\mathbf{h})\right)}{\partial h^T} &= -\lambda\left[f'_{Tl}(h^T) - f'_{Tn}(h^T)\right] \le 0, \\
\frac{\partial\left(-\lambda\left[\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})\right]\right)}{\partial h^T} &= -\lambda\left[f'_{Tl}(h^T) + \frac{\beta(1-\hat q)\, f'_{Tn}(h^T)}{1 - \beta(1-\hat q)}\right] \le 0.
\end{aligned}
\]

2. The impacts of general human capital:
\[
\begin{aligned}
\frac{\partial\left(-\lambda \Pi^n(\mathbf{h})\right)}{\partial h^G} &= -\lambda\left[f'_{Gn}(h^G) - f'_{Gl}(h^G) + F'(h^G)\right] \le 0, \\
\frac{\partial\left(-\lambda \Delta\pi(\mathbf{h})\right)}{\partial h^G} &= -\lambda\left[f'_{Gl}(h^G) - f'_{Gn}(h^G)\right] \ge 0, \\
\frac{\partial\left(-\lambda\left[\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})\right]\right)}{\partial h^G} &= -\lambda F'(h^G) \le 0.
\end{aligned}
\]

The first result in Proposition 4 shows that a rise in $h^T$ increases $\Pi^n(\mathbf{h})$ and $\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})$ and, therefore, always decreases the separation probability of managers. This is because target firm-specific human capital always improves productivity: $f'_{Tl}(h^T) \ge 0$ and $f'_{Tn}(h^T) \ge 0$. A rise in $h^T$ also increases $\Delta\pi(\mathbf{h})$ and, therefore, lowers $-\lambda\Delta\pi(\mathbf{h})$. This is because we assume that target firm-specific human capital is initially more important, $f'_{Tl}(h^T) \ge f'_{Tn}(h^T)$.

The impacts of general human capital are more complicated. Because of the competition for general human capital, a rise in this type of human capital increases not only productivity but also the cost of employing managers. Therefore, how general human capital influences the separation probability appears to be unclear. However, the second result in Proposition 4 provides clear theoretical predictions. That is, a rise in $h^G$ increases $\Pi^n(\mathbf{h})$ and $\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})$ and, therefore, always decreases the separation probability of managers. A rise in $h^G$ increases $\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})$ and, therefore, reduces $q_0$ if it is difficult to find more able managers, $F'(h^G) \ge 0$. If the costs of finding talented managers are the same as those of finding ordinary managers, there are no benefits from employing managers with high general human capital because the competition for talent forces firms to pay high compensation rates for talented managers. The benefits of having managers with high general human capital increase in the later period if employing someone with high learning capability can be considered an investment, $f'_{Gl}(h^G) \le f'_{Gn}(h^G)$. If managers have a greater ability to learn in the new environment, this skill increases the firm's future outputs. However, it does not increase the initial output during a learning period. Because $\Delta\pi(\mathbf{h})$ simply reflects the initial instantaneous output, net of the future instantaneous output yielded by the manager, the impact of learning capability on $\Delta\pi(\mathbf{h})$ is negative. This unique prediction helps us to separate general learning capability from target firm-specific skills.

Tenure and Separation Probability

In this section, we explain how we use the timing of separation to distinguish between two hypotheses on how tenure affects managerial human capital (i.e., an increase in target firm-specific human capital and a reduction in the general ability to learn in a new environment), using the coefficients of the effect of tenure on separation probability. For this purpose, we propose testable hypotheses to interpret the relationship between tenure and separation probability.

Taking a first-order approximation, it can be shown that:
\[
\begin{aligned}
\Pi^n(\mathbf{h}) &= \tilde\alpha_T h^T + \tilde\alpha_G h^G + \tilde\alpha_X' \mathbf{X}_V, \\
\Delta\pi(\mathbf{h}) &= \bar\alpha_T h^T - \bar\alpha_G h^G + \bar\alpha_X' \mathbf{X}_V,
\end{aligned}
\]
where
\[
\begin{aligned}
\tilde\alpha_T &= \frac{f'_{Tn}(h^T)}{1 - \beta(1-\hat q)} \ge 0, &
\tilde\alpha_G &= f'_{Gn}(h^G) - f'_{Gl}(h^G) + F'(h^G) \ge 0, &
\tilde\alpha_X &= \frac{\partial v^n(\mathbf{X}_V)/\partial \mathbf{X}_V}{1 - \beta(1-\hat q)}, \\
\bar\alpha_T &= f'_{Tl}(h^T) - f'_{Tn}(h^T) \ge 0, &
\bar\alpha_G &= f'_{Gn}(h^G) - f'_{Gl}(h^G) \ge 0, &
\bar\alpha_X &= \frac{\partial v^l(\mathbf{X}_V)}{\partial \mathbf{X}_V} - \frac{\partial v^n(\mathbf{X}_V)}{\partial \mathbf{X}_V}.
\end{aligned}
\]

Let $\tau_b$ and $\tau_e$ denote the manager's tenure as a board member and as an employee in the target firm, respectively. Following the tradition of labor economics, we assume that target firm-specific human capital is an increasing function of these tenures:
\[
h^T = \eta_0 + \eta_b \tau_b + \eta_e \tau_e.
\]

We assume that $\eta_x \ge 0$ for $x = b$ or $e$, which means that increases in both types of tenure assist in the accumulation of firm-specific human capital.

Our human capital accumulation functions are linear if managers do not change positions. Since a more general production function would greatly complicate our empirical predictions, we made the linear assumption in order to derive a meaningful interpretation from the empirical results. This assumption can be thought of as a linear approximation to the general human capital accumulation function.

On the other hand, as $h^G$ is a general skill, labor economists typically assume that $h^G$ increases not only as a result of experience in the target firm, i.e., as a result of both $\tau_b$ and $\tau_e$, but also as a result of the duration of other experiences, $\tau_o$. Therefore, we have:
\[
h^G = \omega_0 + \omega_b \tau_b + \omega_e \tau_e + \mu \tau_o + \chi,
\]
where $\omega_x \ge 0$ ($x = b, e$) are the parameters that capture the importance of tenure for learning and $\mu \ge 0$ is the parameter that captures the importance of other experiences.

Our human capital accumulation function is consistent with the following law of motion. For any year $t$,
\[
h_t^T =
\begin{cases}
h_{t-1}^T + \eta_b, & \text{if a worker is a board member,} \\
h_{t-1}^T + \eta_e, & \text{if a worker is an employee,}
\end{cases}
\]
and
\[
h_t^G =
\begin{cases}
h_{t-1}^G + \omega_b, & \text{if a worker is a board member,} \\
h_{t-1}^G + \omega_e, & \text{if a worker is an employee,} \\
h_{t-1}^G + \mu, & \text{if a worker does not belong to the firm.}
\end{cases}
\]

This model generalizes the standard assumption, in that it allows for different experiences to vary in their effectiveness in accumulating general human capital. Estimating $\tau_o$ by $\tau_o = a - \tau_b - \tau_e$, where $a$ is the manager's age when the M&A takes place, we can rewrite $h^G$ as a function of tenures and age:
\[
h^G = \omega_0 + (\omega_b - \mu)\tau_b + (\omega_e - \mu)\tau_e + \mu a + \chi.
\]
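The substitution of $\tau_o = a - \tau_b - \tau_e$ can be checked with a short numerical example. This is a minimal sketch: the function names and every parameter value are hypothetical, and $a$ is treated as the same experience clock that the substitution implies.

```python
def general_capital_direct(tau_b, tau_e, tau_o, omega0, omega_b, omega_e, mu, chi=0.0):
    """h^G written in terms of the three experience durations."""
    return omega0 + omega_b * tau_b + omega_e * tau_e + mu * tau_o + chi

def general_capital_from_age(tau_b, tau_e, age, omega0, omega_b, omega_e, mu, chi=0.0):
    """h^G rewritten in terms of tenures and age, using tau_o = age - tau_b - tau_e."""
    return omega0 + (omega_b - mu) * tau_b + (omega_e - mu) * tau_e + mu * age + chi

# Illustrative values only (assumptions, not estimates from the paper).
params = dict(omega0=1.0, omega_b=0.04, omega_e=0.02, mu=0.03, chi=0.5)
tau_b, tau_e, age = 6.0, 20.0, 58.0
tau_o = age - tau_b - tau_e

direct = general_capital_direct(tau_b, tau_e, tau_o, **params)
rewritten = general_capital_from_age(tau_b, tau_e, age, **params)
print(direct, rewritten)
```

With these illustrative numbers the coefficient on $\tau_b$ in the rewritten form is positive and the coefficient on $\tau_e$ is negative, which shows that the sign of $\omega_x - \mu$ is not restricted by the assumptions $\omega_x \ge 0$ and $\mu \ge 0$.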

The parameters $\omega_x - \mu$ ($x = b, e$) capture the relative productivity of tenure for the accumulation of general human capital compared with other experiences. Buchholtz et al. (2003) argue that longer tenure may hamper the accumulation of general skills that are needed for adaptation to a new environment. This implies that $\omega_x - \mu < 0$. Although this might be a reasonable assumption, we also allow for the possibility that $\omega_x - \mu > 0$: i.e., that experience in a target firm can assist in improving general human capital. Several studies on spin-off effects suggest that previous experience in incumbent firms can be an important source of experience in establishing new firms (e.g., Klepper, 2001). It would be possible to apply similar reasoning in this context. This possibility is captured by $\omega_x - \mu > 0$.

The following proportional hazards model can be derived:
\[
\begin{aligned}
q_t &= B(t:\mathbf{X}_Z)\, e^{-\lambda\left[\Delta\pi(\mathbf{h})\, I(t=0) + \Pi^n(\mathbf{h})\right]}, \\
\Pi^n(\mathbf{h}) &= \tilde\alpha_T h^T + \tilde\alpha_G h^G + \tilde\alpha_X' \mathbf{X}_V, \\
\Delta\pi(\mathbf{h}) &= \bar\alpha_T h^T - \bar\alpha_G h^G + \bar\alpha_X' \mathbf{X}_V, \\
h^T &= \eta_b \tau_b + \eta_e \tau_e, \\
h^G &= (\omega_b - \mu)\tau_b + (\omega_e - \mu)\tau_e + \mu a + \chi.
\end{aligned}
\]

Using this derived hazard function, the theoretically predicted effects of tenure on separation probability are summarized as follows:
\[
\frac{d\left(-\lambda \Pi^n(\mathbf{h})\right)}{d\tau_x} = -\lambda\left[\tilde\alpha_T \eta_x + \tilde\alpha_G(\omega_x - \mu)\right], \tag{7}
\]
\[
\frac{d\left(-\lambda \Delta\pi(\mathbf{h})\right)}{d\tau_x} = -\lambda\left[\bar\alpha_T \eta_x - \bar\alpha_G(\omega_x - \mu)\right], \tag{8}
\]
\[
\frac{\partial\left[-\lambda\left(\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})\right)\right]}{\partial\tau_x} = -\lambda\left[\left(\tilde\alpha_T + \bar\alpha_T\right)\eta_x + \left(\tilde\alpha_G - \bar\alpha_G\right)(\omega_x - \mu)\right], \tag{9}
\]
where $x = b$ or $e$.
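The three tenure effects are simple enough to evaluate directly, which makes the sign patterns discussed next easy to explore. The sketch below is illustrative only: the names `TenureEffects` and `tenure_effects` and all parameter values are hypothetical, and the three fields correspond to equations (7), (8), and (9).

```python
from typing import NamedTuple

class TenureEffects(NamedTuple):
    later: float        # eq. (7): effect on the log hazard for t >= 1
    interaction: float  # eq. (8): extra effect at t = 0 (the t = 0 interaction)
    initial: float      # eq. (9): total effect at t = 0

def tenure_effects(lam, alpha_T_tilde, alpha_G_tilde, alpha_T_bar, alpha_G_bar,
                   eta_x, omega_x, mu):
    """Evaluate the tenure derivatives of the log hazard for one tenure type x."""
    gap = omega_x - mu  # relative productivity of tenure for general skills
    later = -lam * (alpha_T_tilde * eta_x + alpha_G_tilde * gap)
    interaction = -lam * (alpha_T_bar * eta_x - alpha_G_bar * gap)
    return TenureEffects(later, interaction, later + interaction)

# Illustrative case with no role for firm-specific capital (alpha_T terms zero)
# and tenure that helps general skills (omega_x > mu). Values are assumptions.
example = tenure_effects(lam=1.0, alpha_T_tilde=0.0, alpha_G_tilde=0.6,
                         alpha_T_bar=0.0, alpha_G_bar=0.2,
                         eta_x=0.5, omega_x=0.3, mu=0.1)
print(example)  # later < 0, interaction > 0, initial < 0
```

In this example the later-period and initial-period effects share a sign while the $t = 0$ interaction has the opposite sign, which is the pattern described by the first case of Proposition 5.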

Investigating equations (7), (8), and (9) provides the following propositions that we wish to reject in our empirical study. The first proposition below shows the set of conditions under which a newly merged firm does not value any specific skills of employees in the target firm and/or any ability to learn in a new environment, and under which a market friction does not depend on the general human capital.

Proposition 5

We assume that $\eta_x \ge 0$ for both $x = b$ and $e$.

Suppose that $f'_{Gl}(h^G) \le f'_{Gn}(h^G)$ and $F'(h^G) \ge 0$ and, therefore, $\tilde\alpha_G \ge \bar\alpha_G \ge 0$. If $f'_{Tn}(h^T) = f'_{Tl}(h^T) = 0$ and, therefore, $\tilde\alpha_T = \bar\alpha_T = 0$, then, for both $x = b$ and $e$, we have:
\[
\operatorname{sign}\left[\frac{d\left(-\lambda \Pi^n(\mathbf{h})\right)}{d\tau_x}\right]
= -\operatorname{sign}\left[\frac{d\left(-\lambda \Delta\pi(\mathbf{h})\right)}{d\tau_x}\right]
= \operatorname{sign}\left[\frac{\partial\left[-\lambda\left(\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})\right)\right]}{\partial\tau_x}\right]. \tag{10}
\]

Suppose that $f'_{Tl}(h^T) \ge f'_{Tn}(h^T) \ge 0$ and, therefore, $\bar\alpha_T \ge \tilde\alpha_T \ge 0$.

If $f'_{Gl}(h^G) = f'_{Gn}(h^G)$ and, therefore, $\bar\alpha_G = 0$, then, for both $x = b$ and $e$, we have:
\[
\frac{d\left(-\lambda \Delta\pi(\mathbf{h})\right)}{d\tau_x} \le 0. \tag{11}
\]

If $F'(h^G) = 0$ and, therefore, $\tilde\alpha_G - \bar\alpha_G = 0$, then, for both $x = b$ and $e$, we have:
\[
\frac{\partial\left[-\lambda\left(\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})\right)\right]}{\partial\tau_x} \le 0. \tag{12}
\]

If $f'_{Gl}(h^G) = f'_{Gn}(h^G)$ and $F'(h^G) = 0$ and, therefore, $\tilde\alpha_G = \bar\alpha_G = 0$, then, for both $x = b$ and $e$, we have:
\[
\frac{d\left(-\lambda \Pi^n(\mathbf{h})\right)}{d\tau_x} \le 0, \qquad
\frac{d\left(-\lambda \Delta\pi(\mathbf{h})\right)}{d\tau_x} \le 0, \qquad
\frac{\partial\left[-\lambda\left(\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})\right)\right]}{\partial\tau_x} \le 0. \tag{13}
\]

The intuition behind equation (10) in Proposition 5 is as follows. If target firm-specific human capital does not increase productivity, $f'_{Tn}(h^T) = f'_{Tl}(h^T) = 0$, an increase in tenure influences the separation probability only because it can change the learning capability of managers. If $\omega_x \ge \mu$, where $x = b$ or $e$, an increase in tenure assists in improving general human capital relative to other work experience. It therefore increases the productivity of the manager in the newly merged firm and lowers the separation probability at all times. However, because learning takes time, the benefits from a general skill that improves the learning capability are not initially realized. Hence, the initial products of tenure, net of future products, are negative and the coefficient of the interaction with the $t = 0$ dummy must be positive. However, if $\omega_x \le \mu$, the opposite mechanism occurs. Because an increase in tenure reduces the experiences that improve general skills in this case, it simply lowers the productivity of the manager in a merged company and, therefore, increases the separation probability at all times. However, because the initial productivity of the manager after the M&A is not subject to this negative effect, the initial products of tenure, net of future products, are positive. Therefore, the coefficient of the interaction with the $t = 0$ dummy must be negative. Hence, in both cases, equation (10) must be satisfied.

Similarly, we can explain the intuition behind equations (11), (12), and (13) in Proposition 5 as follows. If there are no benefits from learning capability, $f'_{Gl}(h^G) = f'_{Gn}(h^G)$, an improvement in general human capital does not cause any productivity difference between the initial period and the later period. Because the human capital specific to the target firm possessed by the target-firm manager is more important before the managers in the acquiring firm learn the same skills, it would be more highly valued at $t = 0$. Therefore, the coefficient of the interaction with the $t = 0$ dummy must be negative. Equation (11) shows this condition.

If the cost of hiring a new manager does not depend on the general human capital of the manager, $F'(h^G) = 0$, then, as a result of market competition, there are no benefits from employing a manager with high general human capital. Therefore, as an increase in tenure raises target firm-specific human capital, it certainly lowers the initial separation probability. Equation (12) shows this condition.

If there are no benefits from learning capability, $f'_{Gl}(h^G) = f'_{Gn}(h^G)$, and the cost of hiring a new manager does not depend on the general human capital of the manager, $F'(h^G) = 0$, an increase in tenure changes the separation probability only because it can raise the human capital specific to the target firm. In addition, for the reason explained above, the coefficient of the interaction of tenure with the $t = 0$ dummy must be negative. Equation (13) summarizes these conditions.
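The sign restrictions in equations (11), (12), and (13) can be confirmed on a small grid of parameter values. This is a sketch under stated assumptions: `learn` stands for $f'_{Gn} - f'_{Gl}$, `F_prime` for $F'(h^G)$, and `gap` for $\omega_x - \mu$, so that $\bar\alpha_G$ equals `learn` and $\tilde\alpha_G$ equals `learn + F_prime`. All names and grid values are hypothetical.

```python
import itertools

def effects(lam, aT_tilde, aG_tilde, aT_bar, aG_bar, eta, gap):
    """Return the tenure effects of eqs. (7), (8), (9) as a tuple."""
    later = -lam * (aT_tilde * eta + aG_tilde * gap)
    interaction = -lam * (aT_bar * eta - aG_bar * gap)
    return later, interaction, later + interaction

grid = [0.0, 0.5, 1.0]    # non-negative values for alphas, eta, learn, F_prime
gaps = [-1.0, 0.0, 1.0]   # omega_x - mu may take either sign
eps = 1e-12
violations = []

for aT_tilde, aT_bar, eta, learn, F_prime, gap in itertools.product(
        grid, grid, grid, grid, grid, gaps):
    aG_bar = learn               # alpha-bar_G  = f'_Gn - f'_Gl
    aG_tilde = learn + F_prime   # alpha-tilde_G = f'_Gn - f'_Gl + F'
    later, interaction, initial = effects(1.0, aT_tilde, aG_tilde,
                                          aT_bar, aG_bar, eta, gap)
    if learn == 0.0 and interaction > eps:                    # eq. (11)
        violations.append(("11", aT_bar, eta, gap))
    if F_prime == 0.0 and initial > eps:                      # eq. (12)
        violations.append(("12", aT_tilde, aT_bar, eta, gap))
    if learn == 0.0 and F_prime == 0.0 and later > eps:       # eq. (13)
        violations.append(("13", aT_tilde, eta, gap))

print(len(violations))
```

No violations arise on this grid, which reflects that each restriction removes the term multiplying $\omega_x - \mu$, the only term whose sign is not pinned down by the assumptions.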

Note that Proposition 5 requires that equations (10) and (11) are jointly satisfied by the coefficients for both the tenure as a board member, x = b, and the tenure as an employee, x = e. Because an acquiring firm does not care about how a target manager obtains a skill, if the skills are not profitable within the merged firm, the coefficients of tenure as a board member and as an employee will be influenced in a similar way. In other words, if the derived conditions are not satisfied for either x = b or x = e, we can reject the hypotheses in Proposition 5.

On the other hand, the difference between the coefficients for the two types of tenure, x = b and x = e, provides us with information about the differences in skills obtained as a result of experience gained as a board member or as an employee. The following proposition summarizes the hypotheses about specific human capital and experience that we wish to reject.

Proposition 6

Suppose that $f'_{Tl}(h^T) \ge f'_{Tn}(h^T) \ge 0$ and, therefore, $\bar\alpha_T \ge \tilde\alpha_T \ge 0$. Suppose that $f'_{Gl}(h^G) \le f'_{Gn}(h^G)$ and $F'(h^G) \ge 0$ and, therefore, $\tilde\alpha_G \ge \bar\alpha_G \ge 0$. If $\eta_x = 0$, which means that experience gained during $x$, where $x = b$ or $e$, does not raise human capital specific to the target firm, then:
\[
\operatorname{sign}\left[\frac{d\left(-\lambda \Pi^n(\mathbf{h})\right)}{d\tau_x}\right]
= -\operatorname{sign}\left[\frac{d\left(-\lambda \Delta\pi(\mathbf{h})\right)}{d\tau_x}\right]
= \operatorname{sign}\left[\frac{\partial\left[-\lambda\left(\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})\right)\right]}{\partial\tau_x}\right]. \tag{14}
\]

Equation (14) looks very similar to equation (10). The only difference is that Proposition 5 requires that equation (10) must be jointly satisfied for both x = e and x = b, whereas Proposition 6 states that we can separately use equation (14) for x = e and x = b. The intuition behind equation (14) is the same as that behind equation (10). However, by separately applying the same logic to the coefficients for the tenure as a board member and the tenure as an employee, we can determine differences in skills obtained from experience as a board member compared with experience as an employee.

Finally, the following proposition summarizes the hypotheses about general human capital and experience that we wish to reject.

Proposition 7

Suppose that $f_{Tl}'(h^T) \ge f_{Tn}'(h^T) \ge 0$ and, therefore, $\bar\alpha_T \ge \tilde\alpha_T \ge 0$. Suppose that $f_{Gl}'(h^G) \le f_{Gn}'(h^G)$ and $F'(h^G) \ge 0$ and, therefore, $\tilde\alpha_G \ge \bar\alpha_G \ge 0$.

If $\omega_x \ge \mu$, which means that a long tenure during $x$, where $x = b$ or $e$, does not hamper the accumulation of general human capital relative to other experiences, then:

$$\frac{d\left(-\lambda \Pi^n(\mathbf{h})\right)}{d\tau_x} \le 0, \qquad \frac{\partial\left[-\lambda\left(\Delta\pi(\mathbf{h}) + \Pi^n(\mathbf{h})\right)\right]}{\partial\tau_x} \le 0.$$

If $\omega_x \le \mu$, which means that a long tenure during $x$, where $x = b$ or $e$, does not assist in the accumulation of general human capital relative to other experiences, then:

$$\frac{d\left(-\lambda \Delta\pi(\mathbf{h})\right)}{d\tau_x} \le 0.$$

If $\omega_x \ge \mu$, an increase in tenure assists in the accumulation of general human capital. Because an increase in tenure also increases valuable human capital specific to the target firm, a long tenure is valued by the newly merged firm and, therefore, it lowers the separation probability at all times.

However, if $\omega_x \le \mu$, the opposite effect occurs. Because the human capital specific to the target firm is initially more highly valued, and because limited learning abilities do not initially cause any problem, the initial value of tenure is larger than its subsequent value. Therefore, a definite prediction of our theory is that the coefficient of tenure interacted with the $t = 0$ dummy must be negative. Equipped with the theoretical predictions outlined in this section, we now conduct our empirical study.

Dealing with Unobserved Heterogeneity

Suppose that $\mathbf{X}_V$ can be decomposed into the characteristics of the target-firm manager and those of the firm to which the manager belongs. To analyze the determinants of the retention rate of target managers in a newly merged firm, we use the following stratified Cox proportional hazards model, which summarizes our theory:

$$\begin{aligned}
q_{fjt} &= \hat B(t : \mathbf{X}_{Z,f})\, e^{-\lambda\left[\Delta\pi_{fj} I(t = 0) + \Pi_{fj}\right]}, \\
\Pi_{fj} &= \tilde\alpha_b \tau_{b,fj} + \tilde\alpha_e \tau_{e,fj} + \tilde\alpha_D' \mathbf{X}_{D,f,j} + \tilde\alpha_F' \mathbf{X}_{F,f} + \tilde\alpha_\chi \chi_{fj}, \\
\Delta\pi_{fj} &= \bar\alpha_b \tau_{b,fj} + \bar\alpha_e \tau_{e,fj} + \bar\alpha_D' \mathbf{X}_{D,f,j} + \bar\alpha_F' \mathbf{X}_{F,f} + \bar\alpha_\chi \chi_{fj},
\end{aligned}$$

where $\tau_{b,fj}$ is the $j$th person's tenure as a board member at firm $f$, $\tau_{e,fj}$ is the $j$th person's tenure as an employee at firm $f$, the vector $\mathbf{X}_{D,f,j}$ contains any other target-firm manager characteristics of the $j$th person in firm $f$, including the $j$th person's age when firm $f$ was taken over, and the vector $\mathbf{X}_{F,f}$ contains any characteristics of the $f$th firm that can influence the productivity of managers. Finally, the vector $\mathbf{X}_{Z,f}$ contains any firm-level variables that can influence the baseline hazard, and $\hat B(t : \mathbf{X}_{Z,f})$ contains $B(t : \mathbf{X}_{Z,f})$ and some constant terms.
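To illustrate how the two indices enter the specification, the following minimal sketch evaluates the multiplier $e^{-\lambda[\Delta\pi_{fj} I(t=0) + \Pi_{fj}]}$ for a single manager. The function names, the dictionary layout, and every numerical value are hypothetical and are not estimates from this paper; the baseline term $\hat B$ is omitted.

```python
import math

def linear_index(coefs, covariates):
    """Inner product of a coefficient dict and a covariate dict sharing keys."""
    return sum(coefs[name] * covariates[name] for name in coefs)

def relative_hazard(alpha_tilde, alpha_bar, covariates, t, lam=1.0):
    """exp(-lam * [Delta_pi * I(t == 0) + Pi]) for one manager at time t."""
    pi_index = linear_index(alpha_tilde, covariates)   # Pi_fj
    delta_pi = linear_index(alpha_bar, covariates)     # Delta pi_fj
    initial = 1.0 if t == 0 else 0.0                   # I(t = 0)
    return math.exp(-lam * (delta_pi * initial + pi_index))

# Hypothetical coefficients and covariates (illustration only).
alpha_tilde = {"tenure_board": 0.02, "tenure_employee": -0.01, "ability": 0.05}
alpha_bar = {"tenure_board": 0.03, "tenure_employee": 0.04, "ability": 0.00}
manager = {"tenure_board": 8.5, "tenure_employee": 11.8, "ability": 1.0}

h0 = relative_hazard(alpha_tilde, alpha_bar, manager, t=0)
h1 = relative_hazard(alpha_tilde, alpha_bar, manager, t=1)
print(h0, h1)
```

In this sketch the ratio `h0 / h1` equals $e^{-\lambda \Delta\pi_{fj}}$, so a positive $\Delta\pi_{fj}$ lowers the separation hazard in the initial period only.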

This equation depends on the individual unobserved heterogeneity of a manager $j$ at firm $f$, $\chi_{fj}$. This unobserved heterogeneity may be the result of unobserved capabilities, or it may be the relationship with the founding family, main bank, or parent company. We do not know the cause of this heterogeneity, but we assume that the parameter $\chi_{fj}$ can summarize the effect of these heterogeneities. Without controlling for $\chi_{fj}$, our estimates might be biased. As explained in footnote 3, we focus on titled directors in Japanese companies. However, talented workers may be appointed to titled director positions with less experience than less talented workers. Hence, $\chi_{fj}$ and $\tau_{b,fj}$ can be correlated.

In order to deal with this bias, we estimate unobserved abilities from our data and control for them in our survival analysis. The following argument explains how to estimate $\chi_{fj}$.

Suppose that a person is appointed to a management position (titled director) in Japan if and only if:

$$\hat h_{fj}(\hat t) \ge \mathbf{H}'\mathbf{N}_h(f,\hat t), \qquad \hat h_{fj}(\hat t) \equiv (1 - \omega)\, h_{fj}^T(\hat t) + \omega\, h_{fj}^G(\hat t), \tag{15}$$

where $h_{fj}^T(\hat t)$ and $h_{fj}^G(\hat t)$ are the target firm-specific human capital and the general human capital of a person $j$ at the $f$th firm in calendar year $\hat t$, $\omega$ is the weight on the general human capital, and $\mathbf{H}'\mathbf{N}_h(f,\hat t)$ is the level of human capital required for appointment to a managerial position, where $\mathbf{H}$ is a parameter vector and the vector $\mathbf{N}_h(f,\hat t)$ summarizes the conditions that influence the required level of human capital of firm $f$ in calendar year $\hat t$.

From the definition of $\hat h_{fj}(\hat t)$, we can find the parameters $\phi_0$, $\phi_b$, $\phi_e$, and $\phi_o$ to describe the following human capital accumulation function:

$$\hat h_{fj}(\hat t) = \phi_0 + \phi_b \tau_{b,fj}(\hat t) + \phi_e \tau_{e,fj} + \phi_o \tau_{o,fj} + \omega \chi_{fj}, \tag{16}$$

where $\tau_{b,fj}(\hat t)$ is the tenure as a board member of person $j$ at firm $f$ in calendar year $\hat t$ and $\tau_{o,fj}$ is the $j$th person's length of experience before joining firm $f$. Combining equations (15) and (16), when an employee $j$ is appointed to an executive position, the following equality must be satisfied:

$$\tau_{b,fj}(\hat t_j^\star) = \tilde{\mathbf{H}}'\mathbf{N}_h(f,\hat t_j^\star) - \tilde\phi_0 - \tilde\phi_e \tau_{e,fj} - \tilde\phi_o \tau_{o,fj} - \tilde\chi_{fj}, \tag{17}$$

where $\tilde{\mathbf{H}} = \mathbf{H}/\phi_b$, $\tilde\phi_0 = \phi_0/\phi_b$, $\tilde\phi_e = \phi_e/\phi_b$, $\tilde\phi_o = \phi_o/\phi_b$, $\tilde\chi_{fj} = \omega\chi_{fj}/\phi_b$, and $\hat t_j^\star$ is the calendar year when employee $j$ is appointed to an executive officer position. This equation implies that, after controlling for several observable variables, $\tau_{b,fj}(\hat t_j^\star)$ is negatively correlated with $\tilde\chi_{fj}$. This means that $\tau_{b,fj}(\hat t_j^\star)$ potentially contains useful information about $\tilde\chi_{fj}$.

Let us define a function $R$ such that:

$$R(f,j,\hat t) \equiv \tilde{\mathbf{H}}'\mathbf{N}_h(f,\hat t) - \tilde\phi_0 - \tilde\phi_e \tau_{e,fj} - \tilde\phi_o \tau_{o,fj} - \tau_{b,fj}(\hat t).$$

Given the estimates of $\tilde{\mathbf{H}}$, $\tilde\phi_0$, $\tilde\phi_e$, and $\tilde\phi_o$, we can estimate $\tilde\chi_{fj}$ from the following:

$$\tilde\chi_{fj} = R(f,j,\hat t_j^\star).$$

Hence, we seek to obtain unbiased estimators of $\tilde{\mathbf{H}}$, $\tilde\phi_0$, $\tilde\phi_e$, and $\tilde\phi_o$. However, this is not easy because we can observe only a selected sample.
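Taking the scaled parameters as given, the mapping from observables to the ability proxy $\tilde\chi_{fj} = R(f,j,\hat t_j^\star)$ can be sketched as follows. The function names and all numbers are hypothetical; the sketch only shows that, holding other variables fixed, a shorter board tenure at appointment yields a higher proxy.

```python
def required_level(h_tilde, n_h):
    """H-tilde' N_h(f, t): scaled human-capital threshold for appointment."""
    return sum(h * n for h, n in zip(h_tilde, n_h))

def r_value(h_tilde, n_h, phi0, phi_e, phi_o, tau_e, tau_o, tau_b):
    """R(f, j, t) = H~'N_h - phi0~ - phi_e~ * tau_e - phi_o~ * tau_o - tau_b(t)."""
    return required_level(h_tilde, n_h) - phi0 - phi_e * tau_e - phi_o * tau_o - tau_b

# Hypothetical scaled parameters and firm conditions in the appointment year.
h_tilde = [0.5, 2.0]
n_h_at_appointment = [1.7, 4.25]
params = dict(phi0=1.0, phi_e=0.1, phi_o=0.05)

# Two managers who differ only in board tenure when they received a title.
fast_track = r_value(h_tilde, n_h_at_appointment,
                     tau_e=10.0, tau_o=40.0, tau_b=1.0, **params)
slow_track = r_value(h_tilde, n_h_at_appointment,
                     tau_e=10.0, tau_o=40.0, tau_b=5.0, **params)
print(fast_track, slow_track)
```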

We define the deviation of the unobserved ability from the firm-level average in year $\hat t$ for the observed sample, $\varepsilon_{fj}(\hat t)$, as follows:

$$\varepsilon_{fj}(\hat t) \equiv \tilde\chi_{fj} - E\left[\tilde\chi_{fj} \mid I_b = 1, f, \hat t\right] = R(f,j,\hat t_j^\star) - E\left[\tilde\chi_{fj} \mid I_b = 1, f, \hat t\right],$$

where $I_b$ is the dummy variable that equals one if and only if a person is a titled director and is included in our sample and, therefore, $E[\tilde\chi_{fj} \mid I_b = 1, f, \hat t]$ is the conditional expectation of $\tilde\chi_{fj}$, given all relevant information at the $f$th firm in year $\hat t$ for the observed sample. Then, equation (17) can be rewritten as follows:

$$\tau_{b,fj}(\hat t_j^\star) = \tilde{\mathbf{H}}'\mathbf{N}_h(f,\hat t_j^\star) - \tilde\phi_0 - \tilde\phi_e \tau_{e,fj} - \tilde\phi_o \tau_{o,fj} - E\left[\tilde\chi_{fj} \mid I_b = 1, f, \hat t\right] - \varepsilon_{fj}(\hat t), \quad \forall \hat t. \tag{18}$$

Note that $E[\varepsilon_{fj}(\hat t) \mid I_b = 1] = 0$ by construction. Note also that the deviation of the unobserved ability from the conditional expectation of $\tilde\chi_{fj}$ at the $f$th firm in year $\hat t$ for the observed sample, $\varepsilon_{fj}(\hat t)$, is uncorrelated with firm-level variables. Hence, we expect to satisfy an orthogonality condition that $E[\varepsilon_{fj}(\hat t)\mathbf{N}_f \mid I_b = 1] = 0$ for the vector of firm-level variables $\mathbf{N}_f$ for any $\hat t$.

Therefore, if we know a functional form of $E[\tilde\chi_{fj} \mid I_b = 1, f, \hat t]$, we can obtain unbiased estimates of $\tilde{\mathbf{H}}$, $\tilde\phi_0$, $\tilde\phi_e$, and $\tilde\phi_o$ from a generalized method of moments (GMM) estimation, using firm-level variables from the sample of titled directors as its instruments.

In order to implement this idea, we need to estimate $E[\tilde\chi_{fj} \mid I_b = 1, f, \hat t]$. Because we use the sample of titled directors at target firms in the year when an M&A occurs, equation (15) implies that $I_b = 1$ if and only if $\hat h_{fj}(\hat t_f^m) \ge \mathbf{H}'\mathbf{N}_h(f,\hat t_f^m)$, where $\hat t_f^m$ is the calendar year in which the M&A occurs for target firm $f$. Using equation (16) and the function $R(f,j,\hat t)$, this inequality can be rewritten as a condition on $\tilde\chi_{fj}$ as follows:

$$I_b = 1 \;\text{iff}\; \tilde\chi_{fj} \ge R(f,j,\hat t_f^m),$$

for any $f$ and $j$. This means that the observed sample is considered to be drawn from the upper tail of the distribution of $\tilde\chi_{fj}$ and that:

$$E\left[\tilde\chi_{fj} \mid I_b = 1, f, \hat t_f^m\right] = E\left[\tilde\chi_{fj} \mid \tilde\chi_{fj} \ge R(f,j,\hat t_f^m), f, \hat t_f^m\right].$$

We need to estimate $E[\tilde\chi_{fj} \mid \tilde\chi_{fj} \ge R(f,j,\hat t_f^m), f, \hat t_f^m]$ to obtain unbiased estimators of the parameters. The standard two-stage estimator in Heckman (1979) uses a random sample from the population to estimate selection equations, the parameters of which are used to construct an inverse Mills ratio to estimate this conditional expectation, by assuming that $\tilde\chi_{fj}$ is drawn from a normal distribution. Typically, exclusive variables are needed for the estimation of selection equations to obtain reliable estimates.

Unfortunately, a random sample from the population is not available in our case. However, note that our conditional expectation of unobserved ability, $E[\tilde\chi_{fj} \mid \tilde\chi_{fj} \ge R(f,j,\hat t_f^m), f, \hat t_f^m]$, is a function of $R(f,j,\hat t_f^m)$, which is determined by the parameters of our structural equation (18), $\tilde{\mathbf{H}}$, $\tilde\phi_0$, $\tilde\phi_e$, and $\tilde\phi_o$. Therefore, it might be possible to jointly estimate the parameters on the conditional expectation of unobserved ability with equation (18) using the GMM. This is possible because, although we have only a selected sample, the timing of selection is different across individuals, which gives us information about selection equations. In addition, the timing of observations, $\hat t_f^m$, is generally different from the timing of promotions, $\hat t_j^\star$. Therefore, we can find several exclusive variables that are needed to identify parameters and to obtain reliable estimates.

The remaining concern is determining a plausible distribution of $\tilde\chi_{fj}$. Gabaix and Landier (2008) argue that extreme value theory can provide a good approximation for the upper tail of a large class of continuous distributions, including uniform, Gaussian, exponential, lognormal, Weibull, Gumbel, Fréchet, Pareto, stretched exponential, and log-gamma distributions. More specifically, suppose that $\tilde\chi$ is drawn from a distribution function $F(\tilde\chi)$, where $F'(\tilde\chi)$ is differentiable in a neighborhood of the upper bound of its support $\bar\chi \in \mathbb{R} \cup \{+\infty\}$, and there exists $\xi = \lim_{\tilde\chi \to \bar\chi} \frac{d}{d\tilde\chi} \frac{1 - F(\tilde\chi)}{F'(\tilde\chi)}$ with $\xi < \infty$. We define $\bar F(\tilde\chi) = 1 - F(\tilde\chi)$ and $Q(x) = \bar F^{-1}(x)$. Gabaix and Landier (2008) apply an extreme value theorem and show that there exist a $\chi_0$ and a $\zeta$ such that:

$$Q'(x) \approx -\chi_0 x^{\zeta - 1}.$$

Our appendix proves the following lemma.

Lemma 8

Suppose that $Q'(x) = -\chi_0 x^{\zeta - 1}$, where $\zeta \ge 0$, and that there exists $Q(0) = \lim_{\varepsilon \to 0}\left[Q(\varepsilon) + \frac{\chi_0 \varepsilon^\zeta}{\zeta}\right]$ with $Q(0) < \infty$. Then, for any small $x_p$, we have:

$$E\left[\chi \mid x \le x_p\right] = \frac{\zeta Q(0) + Q(x_p)}{\zeta + 1}.$$

The assumption in Lemma 8 requires that the ability distribution has a finite upper bound, Q(0). Gabaix and Landier (2008) provide empirical evidence that supports this assumption using data on U.S. compensation of CEOs.
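The closed form in Lemma 8 can be checked numerically. The sketch below assumes $\zeta > 0$, so that $Q(x) = Q(0) - \chi_0 x^\zeta/\zeta$, and uses the fact that $x = \bar F(\chi)$ is uniformly distributed, so conditioning on $x \le x_p$ amounts to averaging $Q$ over $(0, x_p)$. The parameter values and function names are hypothetical and chosen only for the check.

```python
def q(x, q0, chi0, zeta):
    """Q(x) = Q(0) - chi0 * x**zeta / zeta, so that Q'(x) = -chi0 * x**(zeta - 1)."""
    return q0 - chi0 * x ** zeta / zeta

def conditional_mean_numeric(x_p, q0, chi0, zeta, n=100_000):
    """Midpoint-rule average of Q(x) over x uniformly distributed on (0, x_p)."""
    step = x_p / n
    return sum(q((i + 0.5) * step, q0, chi0, zeta) for i in range(n)) / n

def conditional_mean_lemma(x_p, q0, chi0, zeta):
    """Closed form from Lemma 8: (zeta * Q(0) + Q(x_p)) / (zeta + 1)."""
    return (zeta * q0 + q(x_p, q0, chi0, zeta)) / (zeta + 1)

# Hypothetical parameter values.
args = dict(x_p=0.05, q0=10.0, chi0=3.0, zeta=0.7)
numeric = conditional_mean_numeric(**args)
closed_form = conditional_mean_lemma(**args)
print(numeric, closed_form)
```

The two values agree up to the discretization error of the midpoint rule, which is consistent with integrating $Q$ term by term over $(0, x_p)$.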

We assume that the function $Q$ may differ across firms $f$ and across years $\hat t$, $Q(x : f,\hat t)$, and that $Q(0 : f,\hat t_f^m) = \boldsymbol{\Psi}'\mathbf{N}_\chi(f,\hat t_f^m)$, where $\mathbf{N}_\chi(f,\hat t)$ is a vector of variables that influence a potential upper bound of talent in firm $f$ in calendar year $\hat t$. Let us choose $x_p$ so that $x_p = \bar F\left(R(f,j,\hat t_f^m) : f,\hat t_f^m\right)$. Then, $\{x \le x_p\} = \{\tilde\chi_{fj} \ge R(f,j,\hat t_f^m)\}$ and $Q(x_p : f,\hat t_f^m) = R(f,j,\hat t_f^m)$. Therefore, we can estimate the conditional expectation of ability in each firm by:

$$E\left[\tilde\chi_{fj} \mid \tilde\chi_{fj} \ge R(f,j,\hat t_f^m) : f,\hat t_f^m\right] = \frac{\zeta \boldsymbol{\Psi}'\mathbf{N}_\chi(f,\hat t_f^m) + R(f,j,\hat t_f^m)}{\zeta + 1}.$$

In sum, we can obtain unbiased estimates of $\tilde{\mathbf{H}}$, $\tilde\phi_0$, $\tilde\phi_e$, and $\tilde\phi_o$ using the GMM based on the following orthogonality condition:

$$0 = E\left[\varepsilon_{fj}(\hat t_f^m)\mathbf{N}_f \mid I_b = 1\right], \tag{19}$$

$$\varepsilon_{fj}(\hat t_f^m) = R(f,j,\hat t_j^\star) - \frac{\zeta \boldsymbol{\Psi}'\mathbf{N}_\chi(f,\hat t_f^m) + R(f,j,\hat t_f^m)}{\zeta + 1}, \tag{20}$$

$$R(f,j,\hat t) \equiv \tilde{\mathbf{H}}'\mathbf{N}_h(f,\hat t) - \tilde\phi_0 - \tilde\phi_e \tau_{e,fj} - \tilde\phi_o \tau_{o,fj} - \tau_{b,fj}(\hat t), \tag{21}$$

where $\mathbf{N}_f = \left(\mathbf{N}_h(f,\hat t_j^\star), \mathbf{N}_h(f,\hat t_f^m), \mathbf{N}_\chi(f,\hat t_f^m), \boldsymbol{\tau}_f(\hat t_f^m), \boldsymbol{\tau}_f(\hat t_j^\star)\right)$ and $\boldsymbol{\tau}_f(\hat t)$ is the vector of the firm-level averages of $\tau_{b,fj}(\hat t)$, $\tau_{e,fj}$, and $\tau_{o,fj}$ in year $\hat t$. Using the estimated parameters, $\tilde{\mathbf{H}}$, $\tilde\phi_0$, $\tilde\phi_e$, and $\tilde\phi_o$, we can estimate $\tilde\chi_{fj}$ by:

$$\tilde\chi_{fj} = R(f,j,\hat t_j^\star). \tag{22}$$
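As a rough illustration of how the moment condition could be evaluated on data, the sketch below computes the residual $\varepsilon_{fj}(\hat t_f^m)$ and the sample analogue of the orthogonality condition. The data layout, the function names, the identity weighting matrix, and the single toy observation are all assumptions made for this example; it is not the estimation code behind the results in this paper.

```python
def r_value(theta, n_h, tau_e, tau_o, tau_b):
    """R(f, j, t) for theta = (H~ as a list, phi0~, phi_e~, phi_o~)."""
    h_tilde, phi0, phi_e, phi_o = theta
    threshold = sum(h * n for h, n in zip(h_tilde, n_h))
    return threshold - phi0 - phi_e * tau_e - phi_o * tau_o - tau_b

def residual(theta, zeta, psi, obs):
    """epsilon = R(t*) - (zeta * Psi'N_chi + R(t_m)) / (zeta + 1)."""
    r_star = r_value(theta, obs["n_h_star"], obs["tau_e"], obs["tau_o"], obs["tau_b_star"])
    r_m = r_value(theta, obs["n_h_m"], obs["tau_e"], obs["tau_o"], obs["tau_b_m"])
    upper_bound = sum(p * n for p, n in zip(psi, obs["n_chi"]))
    return r_star - (zeta * upper_bound + r_m) / (zeta + 1)

def sample_moments(theta, zeta, psi, sample):
    """Sample average of epsilon * instrument, one entry per instrument."""
    totals = [0.0] * len(sample[0]["instruments"])
    for obs in sample:
        eps = residual(theta, zeta, psi, obs)
        for i, z in enumerate(obs["instruments"]):
            totals[i] += eps * z
    return [total / len(sample) for total in totals]

def gmm_objective(theta, zeta, psi, sample):
    """Quadratic form g'g with an identity weighting matrix (an assumption)."""
    return sum(g * g for g in sample_moments(theta, zeta, psi, sample))

# One hypothetical observation, for illustration only.
toy = [{
    "n_h_star": [3.0], "n_h_m": [4.0], "n_chi": [5.0],
    "tau_e": 10.0, "tau_o": 40.0, "tau_b_star": 1.0, "tau_b_m": 2.0,
    "instruments": [1.0, 2.0],
}]
theta = ([1.0], 0.0, 0.0, 0.0)
print(sample_moments(theta, 1.0, [1.0], toy), gmm_objective(theta, 1.0, [1.0], toy))
```

In practice a numerical optimizer would minimize such an objective jointly over the structural parameters, $\zeta$, and $\boldsymbol{\Psi}$.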

Using $\tilde\chi_{fj}$, we can estimate:

$$\begin{aligned}
q_{fjt} &= \hat B(t : \mathbf{X}_{Z,f})\, e^{-\lambda\left[\Delta\pi_{fj} I(t = 0) + \Pi_{fj}\right]}, \\
\Pi_{fj} &= \tilde\alpha_b \tau_{b,fj}(\hat t_f^m) + \tilde\alpha_e \tau_{e,fj} + \tilde\alpha_D' \mathbf{X}_{D,f,j} + \tilde\alpha_F' \mathbf{X}_{F,f} + \tilde\alpha_\chi^+ \tilde\chi_{fj}, \\
\Delta\pi_{fj} &= \bar\alpha_b \tau_{b,fj}(\hat t_f^m) + \bar\alpha_e \tau_{e,fj} + \bar\alpha_D' \mathbf{X}_{D,f,j} + \bar\alpha_F' \mathbf{X}_{F,f} + \bar\alpha_\chi^+ \tilde\chi_{fj},
\end{aligned}$$

where $\tilde\alpha_\chi^+ = \frac{\phi_b}{\omega}\tilde\alpha_\chi$ and $\bar\alpha_\chi^+ = \frac{\phi_b}{\omega}\bar\alpha_\chi$.

Data
Firm-level data

Below, we test the propositions outlined in the previous sections, focusing on M&As during the period 1990–2006 in Japan. We identify Japanese M&As in this period from the Delisting dataset, which is manually constructed from various sources of information, including Kaisha Nenkan 1969–2006, Kaisha Shikiho 2000–2006, and Tosho Yoran 1972–1973 and 1975–2007. This dataset contains information on delisted firms in exchange markets throughout Japan from 1968 to 2007. The information includes the stock code and name of the delisted firm, the listed market, dates of listing and delisting, the reason for delisting, and the name and the stock code of the new firm, when the reason given for delisting was a merger or an acquisition.

From the database, we selected those firms that had been delisted because of a “merger” or a “full-ownership acquisition” during the sample period. (Other reasons given for delisting include “stock transfer”, “bankruptcy”, “insolvency”, “window-dressing settlement”, and “business suspension”. There was a single case of stock transfer in our sample, but we excluded it because the target firm was in the JASDAQ market.) When the firms that were delisted as the result of M&As involve a consolidation of assets and liabilities under a company with a stock code, we can observe who is retained in the new firm after the M&A as a board member. We select these types of delisted firms, and refer to them as “target firms”. In most M&As, one of the merging firms survives as the same firm with the same stock code. In other cases, mergers result in a new company with a new stock code. Our dataset includes both cases, and we refer to the post-merger firms in both cases as “new firms” in this paper. When we discuss surviving firms with the same stock codes after M&As, we refer to them as “acquirers” and we refer to all premerger firms that transacted with target firms during M&As as “other merged firms”.

Our sample contains 344 M&A cases, comprising 123 mergers and 221 full-ownership acquisitions. Only mergers between listed firms are included. Figure 1 depicts the number of sample M&As over the study period. We can see that the number of M&As rapidly increased after 1998.

Figure 1

Number of M&As each year during the period 1990–2006.

The exchanges on which the target firms are listed include the Tokyo, Osaka, Nagoya, and other local stock exchanges, as well as emerging and over-the-counter markets. However, we exclude cases in the JASDAQ market because director-level data for these firms are available only after 2000.

We merge data on firm characteristics for the target firms and the new firms with the above M&A data. The data on firm characteristics are from the Nikkei NEEDS database. We first merge the three-digit Nikkei industry code for each target and new firm.

Table 1 presents the numbers of target and new firms in our sample, based on a two-digit industry classification. (We exclude financial institutions such as banks, stock companies, and insurance companies because of the difference in accounting systems.) The second column shows the number of target firms by industry, whereas the third column shows the number of new firms by industry. The fourth column shows the number of M&A cases where the target and new firms are from the same industry. We can see that M&As within the same industry occurred most often in the electronics industry.

Table 1

Numbers of mergers and acquisitions by industry

| Industry | No. of target firms | No. of new firms | Both |
| --- | --- | --- | --- |
| Food | 15 | 16 | 14 |
| Textile | 7 | 8 | 5 |
| Pulp | 13 | 13 | 10 |
| Chemical | 19 | 21 | 14 |
| Pharmaceutical | 6 | 3 | 2 |
| Petroleum | 3 | 6 | 3 |
| Rubber | 2 | 1 | 1 |
| Ceramic | 14 | 13 | 10 |
| Iron ore | 7 | 13 | 7 |
| Nonferrous metal and metal | 15 | 18 | 10 |
| Machinery | 24 | 24 | 16 |
| Electronics | 27 | 46 | 24 |
| Shipbuilding | 0 | 1 | 0 |
| Automobile | 9 | 9 | 4 |
| Transport machinery | 6 | 0 | 0 |
| Precision instruments | 2 | 2 | 1 |
| Other manufacturers | 10 | 9 | 6 |
| Marine products | 1 | 0 | 1 |
| Mining | 4 | 2 | 2 |
| Construction | 26 | 26 | 16 |
| Trade | 43 | 32 | 22 |
| Retail | 27 | 31 | 21 |
| Other financial businesses | 2 | 4 | 2 |
| Real estate | 11 | 1 | 1 |
| Rail and bus | 2 | 12 | 2 |
| Land transportation | 2 | 4 | 2 |
| Marine transportation | 7 | 6 | 6 |
| Air transportation | 1 | 3 | 1 |
| Telecommunications | 4 | 2 | 2 |
| Electricity | 0 | 2 | 0 |
| Service | 35 | 15 | 13 |
| Total | 344 | 344 | 218 |

Next, we merge the financial characteristics in the Nikkei NEEDS database with our sample. More specifically, we merge operating income, sales, number of full-time employees, personnel expenses, total assets, and the stock share of the top ten stockholders for each target firm. These variables are explained in detail below. We also merge operating income, total assets, and the stock share of the top ten stockholders of the acquirer of the target firm in the year prior to the M&A if there is a surviving firm with the same stock code. For cases of M&As without any surviving firm, we use the average value of these variables for all other merged firms instead to represent those of the acquirer.

Personnel data

Next, we merge the firm-level data with the director-level data. The data for board members are taken from the Directors data published by Toyo Keizai. This database contains information on the directors of all listed firms in Japan from 1990 to 2007. The information includes the title, the date the person entered the firm, and a personal history of the board member, along with the name and date of birth.

Because we are interested in the retention of Japanese executives after M&As, we focus our analysis on directors with titles in the sample target firms (auditors are excluded). Therefore, our sample comprises all directors with titles at target firms in the year of the M&A. (More precisely, titled directors who served in their positions until a year before the M&A are also included in our sample because of a gap between the M&A date and the recording date for the Directors data. We assume that titled target directors who resigned their positions more than a year before the M&A did so for reasons other than the M&A, and we exclude them from our sample.) As explained in footnote 3, following the arguments by Kaplan (1994) and Saito and Odagiri (2008), we consider titled directors to be the top executives in Japanese companies. In the following sections, we analyze whether executives in target firms are retained as board members in the new firms after M&As.

We have 1520 observations (titled directors) for 343 targets, with an average of 4.43 titled directors per target. We specify the date of birth, title, the date a person entered the firm, and her/his history as a board member. We also identify the number of other firms (including the target firm itself) in which target-firm managers served as board members in the year of the M&A.

From the Directors data, we determine whether a target manager was retained as a board member (with or without a title) in the new firm after the M&A and, if so, how many years he/she stayed on the board after the M&A. (We consider the target manager to have been retained even if he/she becomes a director with a different title in the new firm or if the manager loses his/her title in the new firm. In addition, in cases where one of the other merged firms survives as the new firm and the manager of the target firm was also a director in that firm and continues to be a director in the new firm, we consider this manager to have been retained.) In addition, we identify whether a target-firm manager served as a board member of the acquirer before the M&A.

Table 2 provides summary statistics of the managers’ retention characteristics. The first variable (retained) takes a value of one if the target manager became a board member of the new firm after the M&A and is zero otherwise. The second variable (years of survival if retained) is the number of years that the retained target manager served as a board member of the new firm after the M&A. The table shows that only 38.7% of the target managers were retained as board members after M&As and that the retained managers kept their positions for an average of less than five years.

Table 2

Characteristics of target directors

| Variable | No. of observations | Mean | S.D. | Min. | Max. |
| --- | --- | --- | --- | --- | --- |
| Retained | 1520 | 0.387 | 0.487 | 0 | 1 |
| Years of survival if retained | 588 | 4.117 | 2.190 | 1 | 16 |

Empirical specification
GMM estimation

Using the dataset explained above, we first conduct a GMM estimation using equations (19), (20), and (21) to obtain the unobserved individual ability defined in equation (22).

The summary statistics of the variables used in the GMM estimation can be found in Table 3. The first four variables are those of the manager's tenure history, which determine his/her human capital at $\hat t_j^\star$ and $\hat t_f^m$, as in equation (16). Tenure as a board member, $\tau_{b,fj}(\hat t)$, is the manager's length of tenure after he/she became a board member in the target firm. We calculate tenure as a board member in the years $\hat t_j^\star$ and $\hat t_f^m$ from the target manager's history as a board member in the target firm. Tenure as an employee, $\tau_{e,fj}$, is the manager's length of tenure as an employee in the target firm, which is calculated by determining the tenure before he/she became a board member, and outside experience, $\tau_{o,fj}$, is calculated from the manager's age (calculated from the date of birth) minus his/her entire period of tenure in the target firm.
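The construction of the tenure variables can be sketched as follows. The sketch assumes that all dates are known to the month, that age is measured at the M&A date, and that tenure is counted in months and then expressed in years; the function names and the example dates are hypothetical.

```python
from datetime import date

def years_between(start, end):
    """Elapsed time from start to end, counted in whole months and expressed in years."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    return months / 12

def tenure_profile(birth, entered_firm, joined_board, titled, merger):
    """Tenure variables for one titled director of a target firm.

    `titled` is the date of appointment to a titled position and
    `merger` is the date of the M&A.
    """
    age = years_between(birth, merger)
    total_tenure = years_between(entered_firm, merger)
    return {
        "tenure_board_at_appointment": years_between(joined_board, titled),
        "tenure_board_at_merger": years_between(joined_board, merger),
        "tenure_employee": years_between(entered_firm, joined_board),
        "outside_experience": age - total_tenure,  # age minus entire tenure in the firm
        "age": age,
    }

# Hypothetical career history.
profile = tenure_profile(
    birth=date(1940, 1, 1),
    entered_firm=date(1962, 4, 1),
    joined_board=date(1990, 6, 1),
    titled=date(1993, 6, 1),
    merger=date(2000, 6, 1),
)
print(profile)
```

Under this definition, outside experience equals the time elapsed between birth and entry into the firm, so it includes the years before the manager started working.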

Table 3

Summary statistics

| Variable | No. | Mean | S.D. | Min. | Max. |
| --- | --- | --- | --- | --- | --- |
| Tenure as a board member at $\hat t_j^\star$ (years) | 1520 | 2.70 | 2.91 | 0.00 | 20.92 |
| Tenure as a board member at $\hat t_f^m$ (years) | 1520 | 8.50 | 7.41 | 0.25 | 56.42 |
| Tenure as an employee (years) | 1520 | 11.82 | 13.59 | 0.00 | 43.17 |
| Outside experience (years) | 1520 | 40.35 | 15.62 | 15.17 | 71.17 |
| Age (years) | 1520 | 60.67 | 5.54 | 35.08 | 90.92 |
| Variables in $\mathbf{N}_h(f,\hat t)$, for $\hat t = \hat t_j^\star, \hat t_f^m$ | | | | | |
| Negative operating income at $\hat t_j^\star$ | 1520 | 0.15 | 0.36 | 0.00 | 1.00 |
| Negative operating income at $\hat t_f^m$ | 1520 | 0.21 | 0.41 | 0.00 | 1.00 |
| Direct | 1520 | 0.20 | 0.40 | 0.00 | 1.00 |
| No. of employees at $\hat t_j^\star$ (1000) | 1520 | 1.70 | 2.65 | 0.006 | 23.87 |
| No. of employees at $\hat t_f^m$ (1000) | 1520 | 1.53 | 2.24 | 0.006 | 16.35 |
| Wage at $\hat t_j^\star$ (million yen) | 1520 | 4.25 | 3.37 | 0.08 | 28.39 |
| Wage at $\hat t_f^m$ (million yen) | 1519 | 5.03 | 3.89 | 0.21 | 25.57 |
| Median tenure as a titled director at $\hat t_j^\star$ (years) | 1520 | 4.89 | 4.00 | 0 | 33.17 |
| Median tenure as a titled director at $\hat t_f^m$ (years) | 1520 | 4.61 | 3.19 | 0.50 | 24.67 |
| Median age of titled directors at $\hat t_f^m$ (years old) | 1520 | 55.35 | 4.34 | 27.58 | 69.17 |
| Variables in $\mathbf{N}_\chi(f,\hat t_f^m)$ | | | | | |
| Sales (1000 million yen) | 1520 | 146.78 | 256.95 | 0.68 | 2877.40 |
| $\boldsymbol{\tau}_f(\hat t_f^m)$, $\boldsymbol{\tau}_f(\hat t_j^\star)$ (instrumental variables) | | | | | |
| Median value of tenure as a board member at $\hat t_f^m$ (years) | 1520 | 7.48 | 4.17 | 0.25 | 32.42 |
| Median value of tenure as an employee (years) | 1520 | 10.84 | 12.17 | 0.00 | 35.83 |
| Median value of outside experience (years) | 1520 | 40.20 | 13.66 | 20.42 | 68.42 |
| Manager characteristics | | | | | |
| No. of firms as a board member | 1520 | 1.14 | 0.51 | 1.00 | 10.00 |
| Board member in the acquirer before the M&A | 1520 | 0.15 | 0.35 | 0.00 | 1.00 |
| Upper | 1520 | 0.31 | 0.46 | 0.00 | 1.00 |
| Target ROA | 1520 | 0.00 | 0.06 | −0.43 | 0.52 |
| Firm characteristics | | | | | |
| Log target assets | 1520 | 11.07 | 1.35 | 6.92 | 14.61 |
| Board size | 1520 | 5.77 | 2.73 | 1.00 | 15.00 |
| Log stock share of top 10 | 1520 | −6.89 | 1.35 | −8.39 | −0.14 |
| Target firm's relative size in assets | 1520 | 0.27 | 0.23 | 0.00 | 0.99 |
| Related | 1520 | 0.48 | 0.50 | 0.00 | 1.00 |
| Acquisition | 1520 | 0.61 | 0.49 | 0.00 | 1.00 |
| Acquirer or others ROA | 1520 | 0.00 | 0.06 | −0.68 | 0.60 |
| Log top 10 share of acquirer or others | 1520 | −7.09 | 1.56 | −8.89 | 0.00 |

For the variables that determine the human capital level required for appointment to a managerial position (a titled director position), $\mathbf{N}_h(f,\hat t)$, we use Negative operating income, Direct, No. of employees, Wage, Median tenure as a titled director, and Median age at $\hat t_f^m$. The summary statistics of these variables in the years $\hat t_j^\star$ and $\hat t_f^m$ are shown in rows 7 to 16 in the table. (More specifically, we calculate the values of these variables immediately before the years $\hat t_j^\star$ and $\hat t_f^m$, except for Direct, which takes the same value for these two years. For median tenure as a titled director, we calculate the median tenure of all the titled directors of firm $f$, excluding that of the target manager. The tenure of each target manager is calculated on a monthly basis and then expressed in years. When we use the variables from Nikkei NEEDS, we use the published data for the latest fiscal year before $\hat t_j^\star$. In some cases, especially for variables at $\hat t_j^\star$, the financial data are not available. In these cases, we use the data for the year nearest to $\hat t_j^\star$.) Negative operating income is an indicator variable that takes a value of one if the operating income of the target firm is negative and is zero otherwise. Kaplan (1994) shows that, if the operating income is negative, then the current managers are likely to be dismissed. Because a firm must replace managers with new appointees for an exogenous reason in this case, the required level of human capital would be lower than is usually the case for appointments. Therefore, we expect a negative impact from negative operating income on the required level of human capital.

Direct is an indicator variable that takes a value of one if the manager is hired from outside the firm. We control for this variable because the required human capital level may differ depending on whether the manager is selected from among the employees of the firm or is hired from outside. We control for the number of employees to estimate the required human capital level because more employees may mean there is more intense competition to become a manager on the board. Therefore, we expect that the number of employees will have a positive effect on the required human capital level. Similarly, a higher average wage may imply that employees of the firm have a higher average human capital level. Therefore, a greater level of human capital may be required for appointment to a titled director position. The average wage is calculated using the number of employees and the personnel expenses of the target firm.
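The construction of the wage and negative operating income variables described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the example figures are hypothetical, and the text specifies only that the average wage is derived from personnel expenses and the number of employees.

```python
def average_wage(personnel_expenses: float, n_employees: int) -> float:
    """Average wage per employee, in the same unit as personnel_expenses."""
    if n_employees <= 0:
        raise ValueError("number of employees must be positive")
    return personnel_expenses / n_employees


def negative_operating_income(operating_income: float) -> int:
    """Indicator: 1 if operating income is negative, 0 otherwise."""
    return 1 if operating_income < 0 else 0


# Hypothetical firm: 8,500 million yen of personnel expenses and 2,000 employees.
wage = average_wage(8500.0, 2000)         # 4.25 (million yen per employee)
flag = negative_operating_income(-120.0)  # 1, because operating income is negative
```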

We include the median length of tenure as a titled director within a firm because a high median tenure indicates a low turnover rate among titled directors. In such cases, it is more difficult to become a titled director. We also control for the median age of titled directors in the merger year. If a firm has a high median age, it is likely that the firm sets a high requirement for becoming a titled director. Therefore, we expect its coefficient to be positive.
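As an illustration of the median tenure variable, the sketch below takes tenures recorded in months, converts them to years, and computes the median over all titled directors other than the target manager, as the variable definition requires. The data layout (a dictionary keyed by a director identifier) and the function name are assumptions made for this example.

```python
from statistics import median


def median_tenure_excluding(tenures_months: dict, target_id: str) -> float:
    """Median tenure in years of all titled directors except the target manager."""
    others = [months / 12 for director, months in tenures_months.items()
              if director != target_id]
    if not others:
        raise ValueError("no other titled directors on the board")
    return median(others)


# Hypothetical board: tenure as a titled director, in months.
board = {"A": 60, "B": 24, "C": 96, "D": 36}

# For target manager "A", the other tenures are 2, 8, and 3 years.
tenure_for_a = median_tenure_excluding(board, "A")  # 3.0
```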

For a variable in a vector $\mathbf{N}_\chi(f,\hat t_f^m)$, we assume that the potential highest level of ability in a target firm depends on the firm's size. We use the sales volume of the target firm as a proxy for the target firm's size. The summary statistics for the sales volume are shown in row 18 of the table. The next three variables in Table 3 represent the instrumental variables for the endogenous variables in the model, that is, the instrumental variables for tenure as a board member at $\hat t_f^m$, tenure as an employee, and outside experience. We use the target firm's median value at $\hat t_f^m$ for these variables as instruments.

Survival analysis

Once we obtain the unobserved individual ability, $\hat\chi_{fj}$, from the GMM estimation, we analyze the determinants of the retention of target-firm managers after M&As. We utilize the stratified Cox proportional hazards model in equations (23), (24), and (25), which requires us to specify three vectors: $X_{D,f,j}$, the target-firm manager characteristics other than tenure at the time of the M&A; $X_{F,f}$, the variables that influence the productivity of managers, which include the target-firm characteristics and the transaction characteristics suggested by Wulf and Singh (2011); and $X_{Z,f}$, which contains any variables that influence the baseline hazard.

Wulf and Singh (2011) investigate the determinants of target CEO retention rates. Our analysis differs in that we include other directors with a title. As for the determinants of CEO retention rates, Wulf and Singh (2011) consider target-firm characteristics, CEO characteristics, transaction characteristics, and acquirer governance characteristics. They also include variables such as target-firm CEO compensation and the number of days as CEO until completion of the merger. Unfortunately, these variables are not available for our chosen context. However, we are able to include more detailed director characteristics because of the richness of our personnel data.

For the variables in the vector $X_{D,f,j}$, we use the manager's age at the time of the M&A (Age), the number of firms in which the target manager served as a board member (No. of firms as a board member), whether the target manager was a board member in the acquirer before the M&A (Board member in the acquirer before the M&A), and whether the target manager was the president or chairman of the board of directors (Upper). These variables are calculated from the Directors data, as explained in the previous section. We include the target firm's performance as a manager characteristic to measure the manager's ability. To represent the target firm's performance, we use the return on assets (ROA) in the year prior to the M&A, where the ROA is defined as operating income divided by assets and is measured as the deviation from the industry median. Note that this measure is at the firm level, even though we consider it as a manager characteristic, as suggested by Wulf and Singh (2011).
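The industry-adjusted ROA measure can be sketched as below. This is a hedged illustration: the text states only that ROA is operating income divided by assets, expressed as a deviation from the industry median. How the industry group is formed (for example, whether it includes the target firm itself) is not specified, so the list of industry ROAs here is simply supplied by the caller, and all numbers are hypothetical.

```python
from statistics import median


def roa(operating_income: float, assets: float) -> float:
    """Return on assets: operating income divided by total assets."""
    if assets <= 0:
        raise ValueError("assets must be positive")
    return operating_income / assets


def industry_adjusted_roa(firm_roa: float, industry_roas: list) -> float:
    """Deviation of a firm's ROA from the median ROA of its industry."""
    return firm_roa - median(industry_roas)


# Hypothetical industry with three firms and a target firm with ROA of 3%.
industry = [0.02, 0.05, 0.08]
target_roa = roa(30.0, 1000.0)
adjusted = industry_adjusted_roa(target_roa, industry)  # roughly -0.02
```

A negative value means the target firm underperformed the median firm in its industry in the year before the M&A.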

The target firm's characteristics in $X_{F,f}$ include firm size and the board size of target firms. To control for the target firm's size and its board size, we include the logarithm of the target firm's assets in the year prior to the merger (Log target assets) and the number of board members (Board size). The previous literature has considered the target firm's size to be important because larger and more complex firms may require managers with unique skills, which may increase the retention rate of managers (Finkelstein and Hambrick, 1989).

The models regarding span of control (e.g., Lucas, 1978; Rosen, 1982) predict that a large firm should be operated by a talented person. Recent assignment theories of managerial compensation (e.g., Terviö, 2007; Gabaix and Landier, 2008) make this assumption and find that this model can quantitatively explain the relation between CEO compensation and firm size. If this prediction is correct, the size of a firm can also be considered as a manager characteristic.

We include the board size of the target firm because an increase in the number of managers may reduce the productivity of managers in a target firm. The board size of the target company is also important for another reason: between 1990 and 2006, many companies reduced the size of their boards. If we do not control for board size, the coefficient on tenure may capture this effect as well.

Our theory is based on the presumption that there are some explicit or implicit transfers that have to be negotiated after the M&A. Although we believe that this is a plausible assumption, some may argue that it does not hold. If so, the initial contracts resulting from the ex ante strategic bargaining during the M&A can influence the separation probability. To estimate the importance of ex ante strategic bargaining, Bargeron et al. (2009) control for a measure of the ownership structure, specifically, the level of insider ownership in a target firm. Following this idea, we include the logarithm of the stock share of the top 10 stock holders (Log stock share of top 10) in $X_{F,f}$, so that we can minimize the bias even if our assumptions are not valid. We expect that the impact of strategic retention would be smaller if the value of this variable were larger because large shareholders have more incentive to monitor managers under a concentrated ownership structure.

The transaction characteristics in $X_{F,f}$ include the relative size of the target firm. If the target firm is relatively large, the postmerger integration may be more difficult and, therefore, we would expect the likelihood of manager retention to be higher (Zollo and Singh, 2004). To measure relative firm size, we calculate the ratio of the target firm assets to the acquirer assets if there is a surviving firm with the same stock code. For cases of M&As without any surviving firm, we use the total assets of all other merged firms in place of the acquirer assets. Following the previous literature, including Walsh (1989) and Buchholtz et al. (2003), we include a dummy variable that represents whether the target and the new firm operate in the same industry as a transaction characteristic. More specifically, we create an indicator variable that is equal to one if the new and target firms operate in the same three-digit Nikkei industry code (i.e., medium industry classification) and is zero otherwise (Related). We also include a full-ownership acquisition dummy to observe whether mergers and full-ownership acquisitions have different effects on the retention rate (Acquisition).
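The two transaction variables defined above can be sketched as follows. The function names, argument layout, and the industry codes used in the example are hypothetical; only the two rules (acquirer assets when a surviving firm exists, otherwise the total assets of all other merged firms; equality of three-digit industry codes) come from the text.

```python
from typing import Optional, Sequence


def relative_size(target_assets: float,
                  acquirer_assets: Optional[float] = None,
                  other_merged_assets: Sequence[float] = ()) -> float:
    """Target assets relative to the acquirer or, absent a surviving firm,
    relative to the total assets of all other merged firms."""
    if acquirer_assets is not None:
        denominator = acquirer_assets
    else:
        denominator = sum(other_merged_assets)
    if denominator <= 0:
        raise ValueError("comparison assets must be positive")
    return target_assets / denominator


def related(target_code_3digit: str, new_firm_code_3digit: str) -> int:
    """Related dummy: 1 if both firms share the same three-digit industry code."""
    return 1 if target_code_3digit == new_firm_code_3digit else 0


# Acquisition with a surviving acquirer.
size_with_acquirer = relative_size(270.0, acquirer_assets=1000.0)            # 0.27
# Merger without a surviving firm: compare with all other merged firms.
size_without = relative_size(100.0, other_merged_assets=(300.0, 100.0))      # 0.25
same_industry = related("231", "231")                                         # 1
```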

We control for the governance characteristics of the other merged firms. Specifically, we control for the ROA and the stock share of the top 10 stock holders of the acquirer if there is a surviving firm with the same stock code. For cases of M&As without any surviving firm, we use the average value of these variables for all other merged firms in place of the acquirer values (“Acquirer or others ROA” and “Log top 10 share of acquirer or others”).

The summary statistics of these variables, which appear exclusively in the stratified Cox proportional hazards estimation, are shown in the last two groups in Table 3. In addition to these variables, we need tenure as a board member at t^fm \hat t_f^m , tenure as an employee, and age, for which summary statistics are shown in the first group of the table. Tenure and age variables are calculated as explained above.

In addition, for the vector $X_{Z,f}$ in equation (23), we include two-digit industry codes for a new firm and a target firm and the year of the M&A. This means that we allow the baseline hazard functions to differ across these groups. Note that, because the hazard rate depends on the number of years elapsed since the M&A, controlling for the year of the M&A helps to avoid the impact of economic conditions. Finally, we construct an initial dummy that takes a value of one for the initial year (t = 0) and zero for any other separation time.

We expand our data in order to include this time-varying covariate. More specifically, each observation of titled directors who are retained in the new firms is expanded to two observations, one for the initial year and the other for the following years. Therefore, we have 2108 observations in total, as seen in our later estimation of the Cox proportional hazards model, comprising 588 observations with an initial dummy value of zero and 1520 observations with an initial dummy value of one.
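The data expansion for the initial dummy can be illustrated with a small sketch. It is a simplified, assumed layout rather than the authors' procedure: durations and event indicators are omitted, and the field names are hypothetical. The synthetic sample reproduces only the counts reported in the text.

```python
def expand_with_initial_dummy(managers):
    """Return one row per manager-period with an 'initial' dummy.

    Every manager contributes a row for the initial year (initial = 1).
    Managers retained in the new firm contribute a second row covering
    the following years (initial = 0).
    """
    rows = []
    for m in managers:
        rows.append({"manager_id": m["manager_id"], "initial": 1})
        if m["retained"]:
            rows.append({"manager_id": m["manager_id"], "initial": 0})
    return rows


# Synthetic sample with the counts from the text: 1520 managers, 588 retained.
managers = [{"manager_id": i, "retained": i < 588} for i in range(1520)]
rows = expand_with_initial_dummy(managers)

n_initial = sum(r["initial"] == 1 for r in rows)
n_later = sum(r["initial"] == 0 for r in rows)
print(n_initial, n_later, len(rows))  # 1520 588 2108
```

The total of 2108 rows matches the number of observations reported for the Cox proportional hazards estimations.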

Results

Table 4 shows the results of our GMM estimation using equations (19), (20), and (21). In addition to the variables explained in the previous section, we include two-digit industry dummies and decade dummies in $\mathbf{N}_h(f,\hat t_j^\star)$ and $\mathbf{N}_h(f,\hat t_f^m)$. We select a group of exogenous variables so that the estimation passes the over-identification test and use them as instruments, along with the instruments for the endogenous variables. The instrumental variables we use are listed under Table 4.

GMM estimation

Variable Coef. Std. Err.
Negative operating income −0.636 0.404
Direct −1.964 1.030 *
No. of employees 0.000 0.000
Wage 0.242 0.049 ***
Median tenure as a titled director 0.025 0.028
Median age 0.801 0.253 ***
Constant ($\tilde\phi_0$) 73.183 13.474 ***
Tenure as an employee ($\phi_e$) 0.400 0.230 *
Outside experience ($\tilde\phi_o$) 0.382 0.224 *
1/(ζ + 1) 0.663 0.051 ***
Sales 0.000 0.000 *
Industry code dummy Yes
Decade dummy Yes

Number of observations 1519

1. Instruments for the equation: negative operating income, direct, no. of employees, wage, median tenure as a titled director, median age of titled directors, median tenure as a board member, median tenure as an employee, median outside experience, sales, and decade and industry dummies for both $\hat t_j^\star$ and $\hat t_f^m$.

2. ***, ** and * denote statistical significance at the 1%, 5%, and 10% significance levels. Standard errors are calculated using bootstrap methods.

We can see in Table 4 that the signs of the coefficients of the variables in the R function in equation (20) are all as expected. That is, a higher number of employees, higher wages, higher median tenure as a titled director, and higher median age seem to make it more difficult for a director to be appointed as a titled director, whereas poor performance of the current management makes it easier. The coefficient of Direct is negative, indicating that the human capital level required to be appointed to a titled director position is lower for managers from outside the firm. The coefficients for Direct, Wage, and Median age are statistically significant.

The results show that the coefficients for both tenure as an employee and outside experience are positive, implying that these variables raise individuals’ human capital. However, because the values of these coefficients are smaller than one, their effect on human capital is smaller than that of tenure as a board member, i.e., experience as a board member seems to be more important for promotion to a management position than is experience as an employee. We estimate 1/(ζ + 1) and obtain a value such that 0 < 1/(ζ + 1) < 1 without imposing any restrictions on estimation. This is consistent with the assumption of ζ > 0 in Lemma 8. The coefficients of the variables that affect the manager's human capital are all statistically significant. The coefficient of sales that appears in $\mathbf{N}_\chi(f,\hat t_f^m)$ is positive and significant, implying that, as the company size grows, the potential highest level of ability increases.
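As a quick arithmetic check of the statement about ζ, the reported point estimate of 1/(ζ + 1) can be inverted to obtain the implied value of ζ. The helper name below is hypothetical, and the calculation uses only the point estimate from Table 4, ignoring sampling error.

```python
def implied_zeta(inverse_term: float) -> float:
    """Solve x = 1 / (zeta + 1) for zeta, given the estimate x."""
    if inverse_term <= 0:
        raise ValueError("1/(zeta + 1) must be positive to invert")
    return 1.0 / inverse_term - 1.0


# Point estimate of 1/(zeta + 1) reported in the GMM table.
zeta = implied_zeta(0.663)
print(round(zeta, 3))  # about 0.508, so zeta > 0
```

Any estimate strictly between zero and one maps to a positive ζ, which is why the unrestricted estimate is consistent with the assumption in Lemma 8.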

Using the estimated parameters, we obtain the unobserved ability, $\tilde\chi_{fj}$, for each manager in our sample from equation (22). Table 5 shows the summary statistics for individual unobserved ability, whereas Figure 2 presents the kernel density estimate of the estimated unobserved ability.

Estimated unobserved skill

Obs. Mean Std. Dev. Min. Max.
$\tilde\chi_{fj}$ 1520 −6.606 5.104 −44.191 5.654

Figure 2

Kernel density of estimated unobserved skill.

Tables 6 and 7 show the results of our stratified Cox proportional hazards model estimation using equations (23), (24), and (25). All results pass the test of the proportional hazards assumption at the 5% level, based on the test of a nonzero slope in a generalized linear regression of the scaled Schoenfeld residuals on the natural log of the analysis time. For Specifications (1) and (3), the standard errors are obtained using 1000 bootstrap samples.

Stratified Cox proportional hazards model estimation

Variable (1) (2)


Coef. Std. Err. Coef. Std. Err.
1: Tenure as a board member 0.006 0.024 −0.010 0.009
2: Tenure as an employee 0.011 0.007 * 0.011 0.005 **
3: Tenure as a board member × initial dummy −0.012 0.024 0.002 0.010
4: Tenure as an employee × initial dummy −0.014 0.007 ** −0.014 0.005 ***

Age 0.112 0.031 *** 0.100 0.016 ***
Board size 0.097 0.081 0.105 0.049 **
No. of firms as a board member −0.166 0.215 −0.179 0.137
Board member in the acquirer before the M&A −0.106 0.479 −0.108 0.308
Acquisition −0.025 0.797 0.003 0.495
Target ROA −11.913 7.626 −11.335 3.860 ***
Related 0.117 0.562 0.034 0.374
Target firm's relative size in assets −0.990 0.530 * −0.942 0.433 **
Log target assets −0.026 0.216 −0.066 0.133
Upper −0.277 0.198 −0.264 0.132 **
Log stock share of top 10 −0.460 0.355 −0.455 0.194 **
Acquirer or others ROA −3.164 6.143 −2.214 3.368
Log top 10 share of acquirer or others 0.134 0.265 0.092 0.162
$\tilde\chi_{fj}$ 0.044 0.039

Age × I(t = 0) −0.072 0.031 *** −0.061 0.017 ***
Board size × I(t = 0) −0.073 0.082 −0.080 0.051
No. of firms as a board member × I(t = 0) −0.293 0.248 −0.281 0.184
Board member in the acquirer before the M&A × I(t = 0) −0.114 0.482 −0.116 0.312
Acquisition × I(t = 0) 0.729 0.819 0.702 0.525
Target ROA × I(t = 0) 11.813 7.691 11.198 3.910 ***
Related × I(t = 0) −0.383 0.577 −0.300 0.388
Target firm's relative size in assets × I(t = 0) −0.210 0.567 −0.257 0.464
Log target assets × I(t = 0) 0.012 0.220 0.054 0.138
Upper × I(t = 0) −0.007 0.207 −0.017 0.143
Log stock share of top 10 × I(t = 0) 0.504 0.357 0.503 0.200 **
Acquirer or others ROA × I(t = 0) 5.533 6.241 4.636 3.577
Log top 10 share of acquirer or others × I(t = 0) −0.161 0.269 −0.120 0.165
$\tilde\chi_{fj}$ × I(t = 0) −0.037 0.039

Wald Test Coef. Chi2(1) Coef. Chi2(1)

1 + 3 −0.006 1.28 −0.008 5.31 **
2 + 4 −0.003 2.71 * −0.003 3.74 *

Number of observations 2108 2108

Stratified by year, acquirer industry code, and target industry code.

***, **, * denote statistical significance at the 1%,5%, and 10% significance levels.

Standard errors are calculated by bootstrap methods.

Stratified Cox proportional hazards model estimation

Variable (3) (4)


Coef. Std. Err. Coef. Std. Err.
1: Tenure as a board member −0.002 0.022 −0.016 0.009
2: Tenure as an employee 0.010 0.006 * 0.010 0.005 *
3: Tenure as a board member × initial dummy −0.005 0.022 0.007 0.010
4: Tenure as an employee × initial dummy −0.013 0.006 ** −0.012 0.005 ***

Age 0.116 0.026 *** 0.107 0.016 ***
Board size 0.036 0.020 * 0.039 0.046 **
No. of firms as a board member −0.358 0.094 *** −0.361 0.082 ***
Board member in the acquirer before the M&A −0.229 0.074 *** −0.232 0.064 ***
Acquisition 0.579 0.156 *** 0.578 0.139 ***
Target ROA −0.466 1.000 −0.463 4.068
Related −0.244 0.106 ** −0.252 0.097 ***
Target firm's relative size in assets −1.016 0.437 ** −1.006 0.374 ***
Log target assets −0.036 0.053 −0.040 0.049
Upper −0.279 0.059 *** −0.275 0.133 ***
Log stock share of top 10 −0.009 0.057 −0.008 0.195
Acquirer or others ROA 1.223 1.439 1.370 1.219
Log top 10 share of acquirer or others −0.028 0.047 −0.035 0.043
$\tilde\chi_{fj}$ 0.036 0.033

Age × I(t = 0) −0.076 0.026 *** −0.069 0.017 ***
Board size × I(t = 0)
No. of firms as a board member × I(t = 0)
Board member in the acquirer before the M&A × I(t = 0)
Acquisition × I(t = 0)
Target ROA × I(t = 0)
Related × I(t = 0)
Target firm's relative size in assets × I(t = 0) −0.294 0.484 −0.289 0.414
Log target assets × I(t = 0)
Upper × I(t = 0)
Log stock share of top 10 × I(t = 0)
Acquirer or others ROA × I(t = 0)
Log top 10 share of acquirer or others × I(t = 0)
$\tilde\chi_{fj}$ × I(t = 0) −0.027 0.034

Wald Test Coef. Chi2(1) Coef. Chi2(1)

1 + 3 −0.006 1.18 −0.009 5.58 **
2 + 4 −0.003 2.95 * −0.003 4.08 **

Number of observations 2108 2108

Stratified by year, acquirer industry code, and target industry code.

***, **, * denote statistical significance at the 1%,5%, and 10% significance levels.

Standard errors are calculated using bootstrap methods.

Specification (1) in Table 6 is our benchmark specification. The result for Specification (1) shows that the coefficient of tenure as a board member is not significant, whereas the coefficient of tenure as an employee is positive and significant. The coefficient of tenure as a board member with an initial dummy is not significant, whereas the coefficient of tenure as an employee with an initial dummy is negative and significant. The sum of the coefficients of tenure as a board member and its interaction with the initial dummy is negative, as is the corresponding sum for tenure as an employee. A Wald test shows that the latter sum is statistically significant, whereas the former is not. The results indicate that tenure as a board member has no significant impact on the separation probability after M&As, irrespective of timing. By contrast, a longer tenure as an employee significantly increases both the probability of appointment as a board member in a new firm and the separation rate in the years after the appointment.

The coefficient for age is positive and significant, whereas that for age with an initial dummy is negative and significant. The sum of these coefficients is positive. Therefore, older managers are less likely to be retained either in the initial year after an M&A or in the years after.
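The sign arguments above rest on adding each level coefficient to its interaction with the initial dummy. The sketch below reproduces that arithmetic for Specification (1) of Table 6 and converts the coefficients into hazard ratios using the standard Cox interpretation, in which a coefficient β corresponds to a hazard ratio of exp(β) per one-unit increase in the covariate. The dictionary layout and function names are hypothetical, and the sketch does not reproduce the Wald tests, which require the covariance of the estimates.

```python
import math

# (level coefficient, coefficient on the interaction with the initial dummy),
# copied from Specification (1) of Table 6.
SPEC_1 = {
    "tenure_board": (0.006, -0.012),
    "tenure_employee": (0.011, -0.014),
    "age": (0.112, -0.072),
}


def initial_year_coef(level: float, interaction: float) -> float:
    """Effect on the log hazard in the initial year (t = 0)."""
    return level + interaction


def hazard_ratio(coef: float) -> float:
    """Multiplicative change in the hazard per one-unit increase."""
    return math.exp(coef)


for name, (level, interaction) in SPEC_1.items():
    total = initial_year_coef(level, interaction)
    # Print the summed coefficient and the hazard ratios for later years
    # (level only) and for the initial year (level plus interaction).
    print(name, round(total, 3),
          round(hazard_ratio(level), 4), round(hazard_ratio(total), 4))
```

A summed coefficient below zero corresponds to an initial-year hazard ratio below one, that is, a lower probability of separation at the time of the M&A.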

Because many other control variables are insignificant in Specification (1), we suspect that there may be some multicollinearity among variables. Hence, we drop the interaction terms with the initial dummy for variables whose level and interaction terms are both insignificant in Specification (1). The result is shown as Specification (3) in Table 7. The results on tenures are unchanged and, therefore, are robust.

The coefficients of many other control variables become significant in Specification (3). Specifically, the coefficient of board size is positive and significant. This may imply that the productivity of managers decreases with the number of managers. The results also indicate that managers who serve as board members in different companies have a lower separation rate. We can see that managers who were board members in the acquirer before the M&A and managers who belonged to a target firm in the same industry as the new firm are more likely to be retained on the board of the new firm. This may imply that acquirer-specific knowledge and industry-specific knowledge are appreciated by the new firm. We can also see that the target managers are less likely to be retained in cases of acquisition and that higher-level managers (Upper) are more likely to be retained.

Specifications (2) and (4) are the same as Specifications (1) and (3), respectively, except that the former do not include the estimated unobserved skill, $\tilde\chi_{fj}$. We can see that the estimated coefficients for Specifications (2) and (4) are almost the same as those for (1) and (3), respectively, except for the coefficients on tenure as a board member and its interaction term. The sum of the coefficients of tenure as a board member and its interaction term is negative and, according to the Wald test, becomes significant in both (2) and (4), implying that longer board experience increases the probability of appointment as a board member in a new firm. However, this seems to reflect a selection bias resulting from an unobserved skill because these effects of board experience disappear in (1) and (3), where we control for unobserved skills.

Now, we wish to interpret the results using our theory. We first apply Proposition 5 to interpret the results of our estimation. Because the conditions in Proposition 5 must be satisfied by the coefficients of both tenure as a board member and tenure as an employee, if the coefficients of either type of tenure can reject the conditions, then the corresponding hypothesis in Proposition 5 is rejected. First, let us test hypotheses 1 and 2 in Proposition 5 using the result for tenure as an employee. Our estimated results on the coefficient of tenure as an employee, the coefficient of tenure as an employee with the initial dummy, and the sum of these two coefficients correspond to $\frac{\partial[-\lambda\Pi^n(\mathbf{h})]}{\partial\tau_e}$, $\frac{\partial[-\lambda\Delta\pi(\mathbf{h})]}{\partial\tau_e}$, and $\frac{\partial[-\lambda(\Delta\pi(\mathbf{h})+\Pi^n(\mathbf{h}))]}{\partial\tau_e}$, respectively. Our estimated coefficients are all statistically significant and the signs we obtained are $\frac{\partial[-\lambda\Pi^n(\mathbf{h})]}{\partial\tau_e} > 0$, $\frac{\partial[-\lambda\Delta\pi(\mathbf{h})]}{\partial\tau_e} < 0$, and $\frac{\partial[-\lambda(\Delta\pi(\mathbf{h})+\Pi^n(\mathbf{h}))]}{\partial\tau_e} < 0$. Hence, they reject the hypothesis $f_{Tn}^{\prime}(h^T) = f_{Tl}^{\prime}(h^T) = 0$ in Proposition 5. Therefore, the evidence suggests that new firms value firm-specific human capital. Although the directions of the coefficients cannot reject the hypotheses $f_{Gl}^{\prime}(h^G) = f_{Gn}^{\prime}(h^G)$ and $F'(h^G) = 0$ separately, they can reject the joint hypothesis of $f_{Gl}^{\prime}(h^G) = f_{Gn}^{\prime}(h^G)$ and $F'(h^G) = 0$ in Proposition 5. This suggests that a new firm values general human capital because new firms appreciate learning capability, because it is more difficult to hire managers with higher general human capital, or both.

Next, we separately investigate the role of tenure as a board member and that of tenure as an employee after M&As. Let us first apply Proposition 6 to the estimated results for tenure as an employee. The evidence for tenure as an employee, $\frac{\partial[-\lambda\Pi^n(\mathbf{h})]}{\partial\tau_e} > 0$, $\frac{\partial[-\lambda\Delta\pi(\mathbf{h})]}{\partial\tau_e} < 0$, and $\frac{\partial[-\lambda(\Delta\pi(\mathbf{h})+\Pi^n(\mathbf{h}))]}{\partial\tau_e} < 0$, rejects the hypothesis that experience as an employee does not increase firm-specific skills ($\eta_e = 0$). Similarly, applying Proposition 7, the same evidence rejects the hypothesis that $\omega_e > \mu$, although we cannot reject the hypothesis that $\omega_e < \mu$. The results consistently show that longer tenure as an employee of a target firm hampers the accumulation of general human capital relative to other experiences.

On the other hand, the relevant coefficients for tenure as a board member are not statistically significant, and we are not able to reject the hypothesis that experience as a board member does not increase firm-specific skills and does not have a larger influence on the accumulation of general human capital relative to other experiences. Without rejections of any null hypotheses, we cannot make a strong argument in relation to the effects of experience as a board member. However, our result seems to be consistent with the argument that experience as a board member does not involve firm-specific human capital and does not hamper the accumulation of general human capital. Hence, it is likely that experience gained as a manager is more general than that gained as an employee.

In summary, we can interpret our results as follows. 1) After an M&A, Japanese firms value both the target firm-specific human capital and the general human capital of managers. 2) Experience as an employee increases firm-specific skills, but at the expense of the accumulation of general human capital. 3) Managerial experiences are likely to be more general than those as an employee.

Discussion

In this section, we discuss whether the moral hazard of managers could influence our interpretation. In line with the literature, including Wulf (2004), Hartzell et al. (2004), and Bargeron et al. (2009), we note that managers with long tenures in target firms may become very powerful and attempt to protect “their position” as much as possible in the event of M&As.

As this paper shows, as long as ex post bargaining is possible without any costs, we do not need to be concerned about this possibility. In fact, although we control for a measure of the ownership structure using the logarithm of the stock share of the top 10 stock holders, it does not have a significant impact on the separation rate in our benchmark specification. This suggests that the entrenchment story is unlikely to be a serious problem for M&As in the Japanese context.

Of course, this cannot be taken as conclusive evidence to justify our presumption. If the ex post negotiation is costly, these managers would remain very powerful in the initial period and the acquiring firms would find it difficult to fire them. However, the power of these managers would eventually be eroded to the point at which the new firms could fire them. Thus, this kind of entrenchment story may be able to explain the coefficient of tenure in our survival analysis.

To the extent that the power of these managers is the source of their productivity, there is no reason to distinguish their power from their target firm-specific skills. In this case, the above story can be consistent with the story that we have presented. However, it is possible that a powerful manager could be unproductive. Unfortunately, we do not have conclusive evidence that can reject this kind of entrenchment story. Nevertheless, there are at least three reasons to consider that the entrenchment of powerful but unproductive managers is not important for our results. First, the entrenchment argument is consistent with our coefficient of tenure as an employee, but it is inconsistent with the coefficient of tenure as a board member. This pattern would be peculiar under the entrenchment story because we would expect tenure as a board member to be a more appropriate measure of power in a firm than tenure as an employee. Second, because it is impossible, in reality, to write an explicit contract for all possible contingencies, powerful managers should anticipate that they will eventually lose their power. If so, it is not theoretically clear why they would initially agree to a merger. Most mergers in Japan are friendly and, therefore, if the managers are very powerful, they could potentially oppose them. This suggests that managers do not possess the degree of power required to support the entrenchment argument. Finally, senior people typically have more authority in Japanese society than do younger people. Hence, it is possible that power in a firm could be captured by age rather than tenure. In fact, the coefficient of age is significantly positive, the interaction with the initial dummy is significantly negative, and the sum of the two coefficients is significantly positive, which indicates that senior managers are less likely to be appointed as board members of a new firm. This is to be expected when these managers are less productive, despite their power in the initial years. Therefore, we expect that, after controlling for age, the entrenchment story will have little effect on the coefficient of tenure.

Contribution of our Results to the Literature

In addition to our methodological contributions to the literature, our results have a number of important implications for several streams of literature, including those on managerial compensation and Japanese M&As. This section discusses how our results contribute to these fields.

First, evidence about the transferability of managerial skills between Japanese firms can provide a useful basis and comparison for the discussion of sources of high compensation for U.S. CEOs. As Bertrand (2009) and Frydman and Jenter (2010) show, some researchers consider the high compensation of U.S. CEOs to be the result of powerful managers setting their own pay rates, whereas others consider it to be the result of a competitive market for managerial talent. One of the key presumptions behind the market-based view (e.g., Gabaix and Landier, 2008; Terviö, 2008) is that important managerial skills are transferable across firms. Indeed, Murphy and Zábojník (2004) and Frydman (2005) argue that a rise in the importance of a CEO's general skills relative to his/her firm-specific skills can explain not only the increase in CEO compensation but also the increase in the turnover rate that has occurred since 1970. Kaplan et al. (2012) investigate a proprietary dataset for executives and provide evidence that general skills are important managerial skills. Our results, focusing on the Japanese context, provide evidence that complements this U.S.-based literature; the transferability of managerial human capital depends on the source of managerial human capital. We find that whereas managerial human capital accumulated through experiences as an employee is firm specific, managerial experiences are likely to be more general than those gained as employees, even though the promotion structure of Japanese firms is considered to facilitate the accumulation of firm-specific skills to a particularly large extent (e.g., Mincer and Higuchi, 1988).

It should be noted that Japanese managers receive far less cash compensation than do their U.S. counterparts

Kaplan (1994) shows that U.S. officers earned 13.5 times the average compensation of other employed males, whereas the ratio for Japanese executives was only 4.8. Moriguchi and Saez (2008) show that, whereas the wage income share of the top 1% of wage earners in the U.S. rose exponentially from 5% to 12% between 1970 and 2005, the share in Japan remained nearly constant at around 5% during this period.

and that Japanese CEOs have very long tenures at their firms relative to CEOs in U.S. firms

Kaplan (1994) shows that the average tenure of a president in a Japanese firm is 34.3 years, whereas that of a U.S. CEO is 26 years.

. This suggests that, even if managerial experience in Japan is transferable across firms, such transferability does not necessarily imply high compensation and a high turnover rate. Accounting for the differences in the market for managers between the U.S. and Japan is beyond the scope of this paper. However, given our evidence that experience gained as an employee is firm specific and is required to manage a newly merged firm, a plausible conjecture is that Japanese managers must perform not only tasks that require managerial experience but also tasks that require the type of experience gained as an employee

In fact, Jacoby (2005) compares Japan's internal labor market with the U.S. internal labor market and highlights the stronger role of personnel departments in Japan.

, which make it more difficult to hire managers from outside the firm in Japan.

Finally, this is the first paper to examine the retention of managers after M&As using a Japanese dataset. The number of M&As in Japan has increased dramatically since the late 1990s. Although there have been several attempts to understand Japanese M&A waves (e.g., Fukao et al., 2005), few papers investigate the separation of workers after M&As. Notable exceptions are Kubo (2004) and Kubo and Saito (2012), which analyze the effect of mergers on employment and wages. In particular, Kubo (2004) uses firm personnel data to analyze the personal characteristics of employees who separated from firms before and after mergers. However, neither work discusses the retention of top managers. Although the impacts of mergers on employees are interesting in their own right, the reasons for the separation of managers from the boards of newly merged firms are likely to differ from the reasons for the separation of employees. Hence, the evidence in our paper provides valuable new information for understanding how Japanese firms adapt to merger waves.

Conclusion

This paper examines how the tenure of managers influences the retention rate of management groups after M&As in Japanese companies. It develops a general equilibrium model of managerial separation after M&As that distinguishes several hypotheses about the effect of tenure on separation, given several data limitations. Our empirical results show that acquiring firms obtain benefits from skills that are specific to the target firm and from skills that are specific to the new firm, which must be acquired by managers from the target firm following an M&A. We also show that experience as an employee in a target firm, prior to becoming a manager, increases valuable specific skills but at the expense of the accumulation of managers’ general human capital.

To confirm the external validity of the empirical results found in data-rich countries such as the U.S., it is important to analyze managerial human capital in multiple countries, including data-poor countries such as Japan. However, limited data availability is likely to impose several constraints on such studies, as it has in our case. This paper provides useful methods to overcome these data limitations. In particular, we provide a novel method to correct for selection biases by utilizing the timing of selection in a selected sample, which does not require a random sample from the population. We hope that our approach can assist in accumulating empirical evidence in countries with limited data.

Of course, the model rests on several assumptions. Hence, we do not claim to provide conclusive evidence on the retention rate of managers after M&As. However, because the importance of human capital in understanding the retention of managers cannot be denied, we believe that our results provide a reasonable explanation of the otherwise puzzling evidence on the retention rate of managers after M&As. Moreover, because our model is flexible and amendable, it can provide a sound basis for the development of a more complete theory of retention in the future.