Dirk Enzmann - Statistical Software (Some Useful Things)

Below you will find some small executables, SPSS macros and scripts, Excel templates, R functions (see: http://www.r-project.org/) and Stata ado-files that I wrote for special calculations in statistical analyses. The executable programs are written in Pascal 7.0 and run under 16- and 32-bit Windows (3.x, 9x, NT4, XP). The files may be downloaded and redistributed without further permission, provided that they remain unchanged. They have been tested as virus-free. The author is not liable for any damages caused by their use. Suggestions for improvements are welcome.

Name | Description | Application | Download
BetaDiff | For calculating confidence intervals and testing the significance of the difference between two beta coefficients from independent samples (description). | Executable | BetaDiff.zip
Center | For centering a set of variables (with listwise deletion of missing cases); useful for computing products of variables for interaction terms in regression analyses. | SPSS | center.sps
clstop_lbt | Stata module to determine via -cluster stop, rule(lbt)- the number of kmeans clusters (or to determine whether there is more than one kmeans cluster) according to the lower bound technique presented in Steinley & Brusco (2011). (To install, you may copy the .ado and .sthlp files into your "\ado\plus\c\" folder; the recommended method, however, is to enter ssc install clstop_lbt in Stata's command window.) | Stata | clstop_lbt.ado, clstop_lbt.sthlp
CorrTot | For computing pooled means, standard deviations and a pooled correlation matrix from the means, standard deviations and correlation matrices of two independent samples (description). | R, Executable | corrtot.r, CorrTot.zip
CovMat | For writing a covariance matrix of a set of variables (with listwise deletion of missing cases) to a text file. | SPSS | covmat.sps
Crosstabs | R function to simulate the SPSS procedure CROSSTABS. | R | crosstabs.r
dta2sav | Stata module to create SPSS syntax and a Stata data file to convert Stata data into SPSS data. Extended missing values that are labeled will be recoded into "numeric" values, which will be defined as missing by the SPSS syntax created by -dta2sav-. This makes it possible to preserve the labels of missing values as defined in Stata for subsequent use in SPSS. (To install, you may copy the .ado and .sthlp files into your "\ado\plus\d\" folder; the recommended method, however, is to enter ssc install dta2sav in Stata's command window.) | Stata | dta2sav.do, dta2sav.sthlp
DumCode | For creating dummy variables (indicator coding) of a nominal variable. Useful for regression analyses with categorical independent variables. | SPSS | dumcode.sps
Fa.promax | To compute a maximum likelihood factor analysis with varimax and promax rotation; allows specification of promax power and sorting of loadings; output includes the correlation matrix of factors and (optionally) matrices of factor scores. | R | fa.promax.r
Freq | R function to simulate the SPSS procedure FREQUENCIES. | R | freq.r
Hist.kdnc | To plot a histogram overlaid by a kernel density and a normal curve. | R | hist.kdnc.r
IntGraph | Template for drawing interaction plots of a regression equation with an interaction term (description). | Excel | intgraph.zip
Kurtosis | To compute the unbiased population estimate or biased sample statistic of kurtosis. | R | kurtosis.r
LogRegR2 | To calculate Chi² model fit and R² analogs (pseudo R²: McFadden's R², Cox & Snell index, Nagelkerke index, McKelvey & Zavoina's R²) of a logistic regression model obtained by glm(..., family = 'binomial'). | R | LogRegR2.r
MeanSD | For interactively computing the mean and standard deviation of a combined sample from up to 50 independent samples. | Executable | meansd.zip
MeanSDF | Same as MeanSD, but for up to 1000 samples and with an input file as input (description). | Executable | meansdf.zip
Median | For calculating the median and quartiles of a variable (optionally for all values of a break variable) according to one of six different methods (description). | SPSS | median.sps
MEResc | To rescale the results of mixed (multilevel) nonlinear probability models such as xtmelogit, xtlogit, or xtprobit to the same scale as the intercept-only model. This makes it possible to compare regression coefficients or variance components across hierarchically nested models [see: Hox, J. J. (2010). Multilevel Analysis: Techniques and Applications (2nd ed., Chapter 6.5, pp. 133-139). New York: Routledge]. (To install, you may copy the .ado, .mo and .sthlp files into your "\ado\plus\m\" folder; the recommended method, however, is to enter ssc install meresc in Stata's command window.) | Stata | meresc.zip
Miss2Sys | Script to recode all missing values of all numeric variables to system missing values (useful if you want to import an SPSS data file with different missing values into R) (description). | SPSS | Miss2Sys.sbs
Moments2 | To calculate the mean, standard deviation, and different types of skewness and kurtosis (according to Joanes & Gill, 1998) of a list of variables. The defaults are the estimates of skewness and kurtosis used in SAS and SPSS. (To install, you may copy the .ado and .hlp files into your "\ado\plus\m\" folder; the recommended method, however, is to enter ssc install moments2 in Stata's command window.) | Stata | moments2.ado, moments2.hlp
nb_adjust | For identifying and adjusting (or removing) outliers of a variable assumed to have a negative binomial distribution. (Requires Stata version 12.1 or higher. To install, you may copy all files of the .zip file starting with "n" into the "\ado\plus\n\" folder and all files starting with "r" into the "\ado\plus\r\" folder.) | Stata | nb_adjust.zip
Part_tst | For testing the difference between two standardized regression coefficients of the same equation (one sample) (description). | SPSS | part_tst.zip
PCA | To compute a principal components "factor" analysis (PCA) with varimax and promax rotation; different options for the number of components (factors): direct specification, parallel test criteria (random eigenvalues), or minimum eigenvalue; optional specification of promax power, sorting of loadings, and matrices of factor scores (see also: RanEigen and Fa.promax). | R | pca.r
Plot.fitPNB | To plot the proportions of the observed counts and the fitted (expected) probabilities of Poisson and negative binomial distributed counts of a variable. | R | plot.fitPoisNegb.r
Plot.kdnc | To plot a kernel density curve overlaid by a normal curve. | R | plot.kdnc.r
Plot.power | To calculate and plot the power of a one-sample z-test of a sample mean. | R | plot.power.r
Plot_Power | Creates a graph to demonstrate power analysis (one-sample z-test of a mean); see the demonstration in pow_demo.do. | Stata | plot_power.do, pow_demo.do
ProfSim | For calculating different measures of profile similarity based on two sets of variables (description: see comments at the end of the macro). | SPSS | profsim.sps
prop.CI | To calculate the confidence interval of a single proportion according to one of eleven methods (see: Brown, Cai, & DasGupta, 2001; Newcombe, 1998) (default: likelihood ratio method) (description: see comments in the source file). | R | prop.CI.r, ex_prop.CI.r
R2_mz | To compute McKelvey & Zavoina's pseudo-R² for multilevel logistic regression, random effects, and fixed effects logit and probit models (see Windmeijer, 1995). (To install, you may copy the .ado, .mo and .sthlp files into your "\ado\plus\r\" folder; the recommended method, however, is to enter ssc install r2_mz in Stata's command window.) | Stata | r2_mz.zip
RanEigen | For determining the number of components (factors) to retain in a principal component analysis (PCA) by using random eigenvalues (parallel analysis) (APM article describing version 1.0) (how to install RanEigen?). | Executable, R | pacrit.zip, RanEigen.r
Rel_Clust | Stata module to compute indices of relative clusterability of a set of variables according to Steinley & Brusco (2008) and to transform a set of variables to z-standardized, range-standardized, or variance-to-range ratio weighted variables for use in (K-means) cluster analysis. (To install, you may copy the .ado and .hlp files into your "\ado\plus\r\" folder; the recommended method, however, is to enter ssc install rel_clust in Stata's command window.) | Stata | rel_clust.ado, rel_clust.sthlp
RelDiff | For computing the reliability of a difference score (gain score) according to Zimmerman & Williams (1982). | Executable | reldiff.zip
Reliability | R function to simulate the SPSS procedure RELIABILITY. | R | reliability.r
r_bis | For computing a biserial correlation coefficient and its significance. | SPSS | r_bis.sps, examp_r.sps
R_Prob | For calculating the significance, 95% confidence interval, and Fisher's Z value of a Pearson correlation coefficient r (given sample size n). | Executable | r_prob.zip
r_tetra | For computing a tetrachoric correlation coefficient and its significance (see also: TetCorr). | SPSS | r_tetra.sps, examp_r.sps
scores (R) | To create scores (min, max, sum, sd, or mean) of variables. The user can specify the minimum number of valid values necessary for the score to be valid. If mean scores are requested, it is possible to center them at the overall mean, to transform them to z-scores, or to transform them to POMP (percent of maximum possible) scores. | R | scores.r, test_sc.r
scores (Stata) | To create scores (row-wise) of a set of variables. The user can specify the minimum number of valid values necessary for the score to be valid. The scores created can be: minimum, maximum, total (sum), median, percentile, standard deviation, or mean. If mean scores are requested, it is possible to center them at the overall mean or to transform them to z-scores, POMP (percent of maximum possible) scores, the proportion of maximum possible scores, or the shrunken proportion of maximum possible scores. (To install, you may copy the .ado and .hlp files into your "\ado\plus\s\" folder; the recommended method, however, is to enter ssc install scores in Stata's command window.) | Stata | scores.ado, scores.hlp
sim_BE | To simulate series of Bernoulli experiments and plot the cumulative sequence of success rates (optionally including confidence intervals). | Stata | sim_be.do, be_demo.do
sim_CI | To demonstrate the concept of confidence intervals (CIs) by simulation. The program creates (animated) plots of confidence intervals (employing either the t- or the normal distribution) by drawing a user-specified number of samples of user-specified size from the normal distribution with user-specified mu and sigma. Optional output contains sample statistics and the coverage rate of the confidence intervals. | R, Stata | sim_CI.r, CI_demo.r, sim_ci.do, ci_demo.do
Skewness | To compute the unbiased population estimate or biased sample statistic of skewness. | R | skewness.r
SortL | To sort rotated factor loadings (pattern matrix) or components previously created by the postestimation command -rotate-. Sorting loadings or components by size facilitates the interpretation of a factor solution. (To install, you may copy the .ado and .hlp files into your "\ado\plus\s\" folder; the recommended method, however, is to enter ssc install sortl in Stata's command window.) | Stata | sortl.ado, sortl.hlp
SPSS2Stata | Script for converting an SPSS data file (.sav) into a Stata/SE data file (.dta). The script now supports variable names longer than 8 characters. You may also find the Stata ado -usespss- useful (to install, enter ssc install usespss in Stata's command window). However, in contrast to this script (and like StatTransfer), -usespss- ignores value labels of missing values (description). | SPSS | spss2stata.sbs
t-Test | For testing the difference in means between two independent samples (given the means, standard deviations and sample sizes of both samples) (description). | Executable | t_test.zip
TabNotes | To convert .not files created by the data entry software EpiData (see: http://www.epidata.dk/index.htm), which contain data entry notes, into a tab-delimited file (for example, to export the notes into an Excel file) (description). | Executable | TabNotes.zip
TetCorr | DOS program and source code (Pascal) for computing a matrix of tetrachoric correlation coefficients of up to 50 variables and a maximum of 8,000 cases (see also: r_tetra) (description). | Executable | tetcorr.zip
TetVNPos | To determine which variables are responsible for a matrix of tetrachoric correlations not being positive definite (dependencies: packages -psych- and -mvtnorm-). | R | TetVNPos.r
TRd | For computing the Satorra-Bentler scaled chi-square difference test (TRd) based on the MLM estimators obtained by Mplus; see: http://www.statmodel.com/chidiff.html. | Executable | trd.zip
VDef2SPS | Script for creating SPSS syntax to define the variables (variable labels, value labels, and missing values) according to the definitions of a specific SPSS data file (*.sav) (description). | SPSS | VDef2SPS.sbs
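
As an illustration of the kind of calculation listed above, the following Python sketch shows one common way to obtain the mean and standard deviation of a combined sample from the sizes, means and standard deviations of independent samples (the task described for MeanSD). It is not the code of MeanSD itself: the function name combine_mean_sd is hypothetical, and the sketch assumes that each standard deviation is a sample standard deviation computed with an n - 1 denominator.

```python
import math


def combine_mean_sd(samples):
    """Combine (n, mean, sd) triples from independent samples.

    Hypothetical helper for illustration only. Assumes every sd is the
    sample standard deviation (n - 1 denominator) and that the combined
    sample contains at least two cases.
    """
    total_n = sum(n for n, _, _ in samples)
    grand_mean = sum(n * m for n, m, _ in samples) / total_n
    # Sum of squares within each sample, recovered from its sd.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in samples)
    # Sum of squares of the sample means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in samples)
    combined_sd = math.sqrt((ss_within + ss_between) / (total_n - 1))
    return total_n, grand_mean, combined_sd


# Two small samples: [1, 2, 3] (n=3, mean=2, sd=1) and
# [4, 5] (n=2, mean=4.5, sd=sqrt(0.5)).
n, mean, sd = combine_mean_sd([(3, 2.0, 1.0), (2, 4.5, math.sqrt(0.5))])
print(n, mean, round(sd, 4))
```

The result should match the mean and standard deviation computed directly from the five pooled values 1, 2, 3, 4, 5, because the total sum of squares is the sum of the within-sample and between-sample parts.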


(last update: July 19, 2013)