
. 
. 
. * ── Industry variation ──────────────────────────────────────────────────────
. 
. use "Intermediate/temp19.dta", clear

. 
. gen sicclass="Agriculture" if sic2<10
(15,437 missing values generated)

. replace sicclass="Mining" if sic2>=10 & sic2<=14
(583 real changes made)

. replace sicclass="Construction" if sic2>=15 & sic2<=17
variable sicclass was str11 now str12
(285 real changes made)

. replace sicclass="Manufacturing" if sic2>=20 & sic2<40
variable sicclass was str12 now str13
(6,015 real changes made)

. replace sicclass="Transportation and Utilities" if sic2>=40 & sic2<50
variable sicclass was str13 now str28
(1,389 real changes made)

. replace sicclass="Wholesale Trade" if sic2>=50 & sic2<=51
(510 real changes made)

. replace sicclass="Retail Trade" if sic2>=52 & sic2<60
(1,192 real changes made)

. replace sicclass="Finance and Real Estate" if sic2>=60 & sic2<70
(3,113 real changes made)

. replace sicclass="Services" if sic2>=70 & sic2<90
(2,324 real changes made)

. replace sicclass="Other" if sic2==99
(26 real changes made)

. encode sicclass, gen(Industry)

. 
. 
. collapse (mean)std_net_seed std_net_chatgpt, by(sicclass)

. drop if sicclass=="Other" | sicclass=="Agriculture"
(2 observations deleted)

. sort std_net_seed std_net_chatgpt

. export delimited "Intermediate/industry_plot_df.csv", replace
file Intermediate/industry_plot_df.csv saved

. 
. 
. * ── Time variation ──────────────────────────────────────────────────────────
. 
. use "Intermediate/temp19.dta", clear

. collapse (mean)std_net_seed std_net_chatgpt, by(year)

. export delimited "Intermediate/year_plot_df.csv", replace
file Intermediate/year_plot_df.csv saved

. 
. 
. * ── Variance decomposition ──────────────────────────────────────────────────
. 
. use "Intermediate/temp19.dta", clear

. eststo: reg std_net_chatgpt i.year

      Source |       SS           df       MS      Number of obs   =    15,446
-------------+----------------------------------   F(12, 15433)    =      6.29
       Model |  73.4454055        12  6.12045046   Prob > F        =    0.0000
    Residual |  15013.3652    15,433  .972809256   R-squared       =    0.0049
-------------+----------------------------------   Adj R-squared   =    0.0041
       Total |  15086.8107    15,445  .976808718   Root MSE        =    .98631

------------------------------------------------------------------------------
std_net_ch~t | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2009  |  -.0121203   .0524552    -0.23   0.817    -.1149386    .0906981
       2010  |  -.1119732   .0497957    -2.25   0.025    -.2095785   -.0143678
       2011  |  -.1479593   .0490007    -3.02   0.003    -.2440066   -.0519121
       2012  |  -.0614101   .0481302    -1.28   0.202     -.155751    .0329307
       2013  |  -.0855331   .0476054    -1.80   0.072    -.1788454    .0077791
       2014  |  -.0166975   .0469162    -0.36   0.722    -.1086587    .0752638
       2015  |    .006578    .046364     0.14   0.887     -.084301    .0974569
       2016  |   .0557977   .0461607     1.21   0.227    -.0346828    .1462782
       2017  |   .0106766   .0459671     0.23   0.816    -.0794243    .1007775
       2018  |   .0340141   .0459264     0.74   0.459    -.0560071    .1240353
       2019  |  -.0111593   .0458612    -0.24   0.808    -.1010526    .0787341
       2020  |   .1224331   .0460549     2.66   0.008       .03216    .2127061
             |
       _cons |   .0121866    .037907     0.32   0.748    -.0621156    .0864888
------------------------------------------------------------------------------
(est1 stored)

. eststo: areg std_net_chatgpt i.year, absorb(sic2)

Linear regression, absorbing indicators             Number of obs     = 15,446
Absorbed variable: sic2                             No. of categories =     65
                                                    F(12, 15369)      =   6.44
                                                    Prob > F          = 0.0000
                                                    R-squared         = 0.0298
                                                    Adj R-squared     = 0.0250
                                                    Root MSE          = 0.9759

------------------------------------------------------------------------------
std_net_ch~t | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2009  |  -.0144428   .0519222    -0.28   0.781    -.1162164    .0873308
       2010  |  -.1097654   .0492957    -2.23   0.026    -.2063907   -.0131401
       2011  |  -.1434404   .0485199    -2.96   0.003    -.2385452   -.0483356
       2012  |   -.056664   .0476689    -1.19   0.235    -.1501006    .0367726
       2013  |  -.0837627   .0471449    -1.78   0.076    -.1761723    .0086468
       2014  |  -.0122956   .0464766    -0.26   0.791    -.1033953    .0788042
       2015  |   .0106947    .045945     0.23   0.816     -.079363    .1007524
       2016  |   .0595286   .0457409     1.30   0.193     -.030129    .1491861
       2017  |   .0158266   .0455519     0.35   0.728    -.0734606    .1051137
       2018  |   .0393992   .0455148     0.87   0.387    -.0498151    .1286136
       2019  |  -.0061292   .0454549    -0.13   0.893    -.0952261    .0829677
       2020  |   .1257505   .0456497     2.75   0.006     .0362716    .2152293
             |
       _cons |   .0085646   .0375644     0.23   0.820    -.0650661    .0821953
------------------------------------------------------------------------------
F test of absorbed indicators: F(64, 15369) = 6.163           Prob > F = 0.000
(est2 stored)

. eststo: areg std_net_chatgpt i.year, absorb(gvkey)

Linear regression, absorbing indicators             Number of obs     = 15,446
Absorbed variable: gvkey                            No. of categories =  1,572
                                                    F(12, 13862)      =   5.74
                                                    Prob > F          = 0.0000
                                                    R-squared         = 0.2429
                                                    Adj R-squared     = 0.1565
                                                    Root MSE          = 0.9077

------------------------------------------------------------------------------
std_net_ch~t | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2009  |  -.0313576   .0487525    -0.64   0.520     -.126919    .0642039
       2010  |  -.1237443   .0463248    -2.67   0.008    -.2145472   -.0329414
       2011  |  -.1423482   .0456653    -3.12   0.002    -.2318584    -.052838
       2012  |  -.0736488   .0449146    -1.64   0.101    -.1616875    .0143898
       2013  |   -.108849   .0444938    -2.45   0.014    -.1960628   -.0216351
       2014  |  -.0298531   .0439449    -0.68   0.497     -.115991    .0562847
       2015  |  -.0115146   .0435314    -0.26   0.791     -.096842    .0738127
       2016  |   .0353839   .0433858     0.82   0.415    -.0496581    .1204258
       2017  |  -.0027031   .0432405    -0.06   0.950    -.0874603    .0820542
       2018  |   .0135878   .0432494     0.31   0.753    -.0711869    .0983626
       2019  |   -.032524   .0432559    -0.75   0.452    -.1173114    .0522634
       2020  |   .0866773   .0435073     1.99   0.046     .0013972    .1719575
             |
       _cons |    .029107   .0356291     0.82   0.414    -.0407309    .0989448
------------------------------------------------------------------------------
F test of absorbed indicators: F(1571, 13862) = 2.775         Prob > F = 0.000
(est3 stored)

. eststo: areg std_net_chatgpt i.year, absorb(co_per_rol)

Linear regression, absorbing indicators             Number of obs     = 15,446
Absorbed variable: co_per_rol                       No. of categories =  3,105
                                                    F(12, 12329)      =   5.42
                                                    Prob > F          = 0.0000
                                                    R-squared         = 0.3643
                                                    Adj R-squared     = 0.2037
                                                    Root MSE          = 0.8820

------------------------------------------------------------------------------
std_net_ch~t | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2009  |  -.0504605   .0494408    -1.02   0.307    -.1473721    .0464512
       2010  |  -.1232479   .0479974    -2.57   0.010    -.2173303   -.0291656
       2011  |  -.1650078   .0477703    -3.45   0.001    -.2586451   -.0713705
       2012  |  -.0882295   .0476557    -1.85   0.064    -.1816421    .0051831
       2013  |  -.1158154   .0478111    -2.42   0.015    -.2095326   -.0220981
       2014  |  -.0429943   .0476923    -0.90   0.367    -.1364788    .0504901
       2015  |  -.0112162   .0477636    -0.23   0.814    -.1048402    .0824079
       2016  |   .0280361   .0480968     0.58   0.560    -.0662411    .1223133
       2017  |  -.0043266   .0484009    -0.09   0.929       -.0992    .0905468
       2018  |   .0212509     .04892     0.43   0.664      -.07464    .1171417
       2019  |  -.0032727   .0495786    -0.07   0.947    -.1004546    .0939092
       2020  |   .1181645   .0507749     2.33   0.020     .0186378    .2176912
             |
       _cons |   .0285388   .0396308     0.72   0.471    -.0491437    .1062213
------------------------------------------------------------------------------
F test of absorbed indicators: F(3104, 12329) = 2.246         Prob > F = 0.000
(est4 stored)

. esttab, b(3) star(* 0.10 ** 0.05 *** 0.01) r2 se nonotes nogaps depvars drop(*.year) title("Org Capital(ChatGPT)"), using "$root/results/incremental_variation_1.csv", replace
(output written to <SET YOUR PATH>/results/incremental_variation_1.csv)

. eststo clear

. 
. eststo: reg std_net_seed i.year

      Source |       SS           df       MS      Number of obs   =    15,446
-------------+----------------------------------   F(12, 15433)    =     23.63
       Model |  240.824116        12  20.0686763   Prob > F        =    0.0000
    Residual |  13104.6758    15,433  .849133402   R-squared       =    0.0180
-------------+----------------------------------   Adj R-squared   =    0.0173
       Total |  13345.4999    15,445  .864066035   Root MSE        =    .92148

------------------------------------------------------------------------------
std_net_seed | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2009  |  -.0922881   .0490075    -1.88   0.060    -.1883486    .0037724
       2010  |  -.0526353   .0465228    -1.13   0.258    -.1438255    .0385548
       2011  |  -.0284791   .0457801    -0.62   0.534    -.1182135    .0612553
       2012  |   .0150602   .0449668     0.33   0.738      -.07308    .1032004
       2013  |  -.0154303   .0444765    -0.35   0.729    -.1026095    .0717488
       2014  |   .0092771   .0438325     0.21   0.832    -.0766398    .0951941
       2015  |   .1523379   .0433167     3.52   0.000     .0674321    .2372437
       2016  |   .1724544   .0431268     4.00   0.000     .0879208    .2569879
       2017  |    .185981   .0429458     4.33   0.000     .1018021    .2701599
       2018  |   .1952935   .0429078     4.55   0.000     .1111891    .2793979
       2019  |   .1926763   .0428469     4.50   0.000     .1086914    .2766613
       2020  |   .3479012   .0430279     8.09   0.000     .2635614    .4322409
             |
       _cons |  -.0998583   .0354155    -2.82   0.005    -.1692769   -.0304397
------------------------------------------------------------------------------
(est1 stored)

. eststo: areg std_net_seed i.year, absorb(sic2)

Linear regression, absorbing indicators             Number of obs     = 15,446
Absorbed variable: sic2                             No. of categories =     65
                                                    F(12, 15369)      =  23.79
                                                    Prob > F          = 0.0000
                                                    R-squared         = 0.0332
                                                    Adj R-squared     = 0.0284
                                                    Root MSE          = 0.9163

------------------------------------------------------------------------------
std_net_seed | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2009  |  -.0920108   .0487478    -1.89   0.059    -.1875622    .0035406
       2010  |  -.0508104   .0462818    -1.10   0.272    -.1415283    .0399075
       2011  |   -.025374   .0455535    -0.56   0.578    -.1146643    .0639163
       2012  |   .0178642   .0447545     0.40   0.690      -.06986    .1055883
       2013  |  -.0138466   .0442626    -0.31   0.754    -.1006065    .0729133
       2014  |   .0123947   .0436352     0.28   0.776    -.0731354    .0979248
       2015  |   .1546843   .0431361     3.59   0.000     .0701325    .2392361
       2016  |    .174279   .0429444     4.06   0.000     .0901029    .2584551
       2017  |   .1881508    .042767     4.40   0.000     .1043225    .2719792
       2018  |   .1980152   .0427321     4.63   0.000     .1142552    .2817752
       2019  |   .1948688   .0426759     4.57   0.000      .111219    .2785185
       2020  |      .3498   .0428588     8.16   0.000     .2657917    .4338083
             |
       _cons |  -.1019748   .0352678    -2.89   0.004    -.1711039   -.0328457
------------------------------------------------------------------------------
F test of absorbed indicators: F(64, 15369) = 3.762           Prob > F = 0.000
(est2 stored)

. eststo: areg std_net_seed i.year, absorb(gvkey)

Linear regression, absorbing indicators             Number of obs     = 15,446
Absorbed variable: gvkey                            No. of categories =  1,572
                                                    F(12, 13862)      =  26.04
                                                    Prob > F          = 0.0000
                                                    R-squared         = 0.2367
                                                    Adj R-squared     = 0.1495
                                                    Root MSE          = 0.8573

------------------------------------------------------------------------------
std_net_seed | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2009  |  -.0975831   .0460425    -2.12   0.034    -.1878326   -.0073335
       2010  |  -.0311494   .0437498    -0.71   0.476    -.1169049    .0546061
       2011  |   .0024756   .0431269     0.06   0.954    -.0820591    .0870102
       2012  |   .0428909   .0424179     1.01   0.312     -.040254    .1260358
       2013  |   .0058003   .0420205     0.14   0.890    -.0765656    .0881662
       2014  |   .0379551   .0415021     0.91   0.360    -.0433947    .1193048
       2015  |   .1750239   .0411116     4.26   0.000     .0944396    .2556083
       2016  |   .1962027   .0409741     4.79   0.000     .1158879    .2765175
       2017  |   .2155976    .040837     5.28   0.000     .1355517    .2956436
       2018  |    .221228   .0408454     5.42   0.000     .1411656    .3012905
       2019  |   .2238537   .0408515     5.48   0.000     .1437793    .3039281
       2020  |    .370812   .0410889     9.02   0.000     .2902723    .4513518
             |
       _cons |  -.1232985   .0336486    -3.66   0.000    -.1892544   -.0573427
------------------------------------------------------------------------------
F test of absorbed indicators: F(1571, 13862) = 2.527         Prob > F = 0.000
(est3 stored)

. eststo: areg std_net_seed i.year, absorb(co_per_rol)

Linear regression, absorbing indicators             Number of obs     = 15,446
Absorbed variable: co_per_rol                       No. of categories =  3,105
                                                    F(12, 12329)      =  14.87
                                                    Prob > F          = 0.0000
                                                    R-squared         = 0.3590
                                                    Adj R-squared     = 0.1970
                                                    Root MSE          = 0.8330

------------------------------------------------------------------------------
std_net_seed | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2009  |  -.1141547   .0466949    -2.44   0.015    -.2056839   -.0226254
       2010  |  -.0499992   .0453316    -1.10   0.270    -.1388562    .0388578
       2011  |  -.0358914   .0451172    -0.80   0.426    -.1243281    .0525453
       2012  |    .018667   .0450089     0.41   0.678    -.0695576    .1068915
       2013  |  -.0238088   .0451557    -0.53   0.598     -.112321    .0647035
       2014  |   .0019346   .0450435     0.04   0.966    -.0863578    .0902269
       2015  |   .1483536   .0451108     3.29   0.001     .0599294    .2367778
       2016  |   .1625558   .0454255     3.58   0.000     .0735148    .2515969
       2017  |   .1832285   .0457128     4.01   0.000     .0936243    .2728326
       2018  |   .1871181    .046203     4.05   0.000      .096553    .2776833
       2019  |   .1931753   .0468251     4.13   0.000     .1013909    .2849598
       2020  |   .3445469   .0479549     7.18   0.000     .2505479     .438546
             |
       _cons |   -.094968   .0374297    -2.54   0.011    -.1683361      -.0216
------------------------------------------------------------------------------
F test of absorbed indicators: F(3104, 12329) = 2.113         Prob > F = 0.000
(est4 stored)

. esttab, b(3) star(* 0.10 ** 0.05 *** 0.01) r2 se nonotes nogaps depvars drop(*.year) title("Org Capital(Seed Word)"), using "$root/results/incremental_variation_1.csv", append
(output written to <SET YOUR PATH>/results/incremental_variation_1.csv)

. eststo clear

. 
. 
. 
. quietly log close
