-
BW=number
specifies the bandwidth to apply to each variable in each kernel density estimate. Larger bandwidths produce a smoother estimate, whereas smaller bandwidths produce a rougher estimate. To specify different bandwidths for different variables, specify BW=number as a v-option. By default, the bandwidth is set automatically by the Sheather-Jones plug-in method (see the section Bandwidth Selection).
-
BWM=number
specifies the bandwidth multiplier to apply to the corresponding bandwidth for each variable. Values of number greater than 1 increase the effective bandwidth and produce a smoother estimate. Values less than 1 decrease the effective bandwidth and produce a rougher estimate. To specify different bandwidth multipliers for different variables, specify BWM=number as a v-option. By default, BWM=1.
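As a sketch of how BW= and BWM= might be used, the following steps assume the SASHELP.CARS sample data set and its Horsepower and Weight variables, none of which appear in this section. The bandwidth values are arbitrary. The parenthesized form follows the UNIVAR statement syntax, in which v-options are listed in parentheses after the variable they apply to.

```sas
/* Assumption: SASHELP.CARS sample data; the bandwidths 25 and 300 are arbitrary. */
proc kde data=sashelp.cars;
   /* v-options in parentheses apply only to the preceding variable */
   univar horsepower (bw=25) weight (bw=300);
run;

proc kde data=sashelp.cars;
   /* keep the automatically selected bandwidths but double them for both variables */
   univar horsepower weight / bwm=2;
run;
```

The first step fixes a separate bandwidth for each variable. The second leaves bandwidth selection to the default method and only rescales the result.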
-
CDF
computes the distribution function in addition to the density function for each variable. The distribution function is obtained by a seminumerical technique as described in the section Kernel Distribution Estimates.
-
GRIDL=number
specifies a lower grid limit for each kernel density estimate. To specify different lower grid limits for different variables, specify GRIDL=number as a v-option. The default value for a particular variable is a function of both the kernel bandwidth and the minimum observed value for that variable.
-
GRIDU=number
specifies an upper grid limit for each kernel density estimate. To specify different upper grid limits for different variables, specify GRIDU=number as a v-option. The default value for a particular variable is a function of both the kernel bandwidth and the maximum observed value for that variable.
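A minimal sketch of setting both grid limits explicitly, assuming the SASHELP.CARS sample data set and its Horsepower variable (not part of this section). The limits 0 and 600 are arbitrary illustration values.

```sas
/* Assumption: SASHELP.CARS sample data; 0 and 600 are arbitrary grid limits. */
proc kde data=sashelp.cars;
   /* evaluate the density estimate on a grid running from 0 to 600 */
   univar horsepower / gridl=0 gridu=600;
run;
```

Fixing the limits in this way can be useful when estimates from several runs need to share the same grid range.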
-
LEVELS
LEVELS=(numlist)
computes a table of levels (called "Levels") for contours of the univariate density. The number of contours is equal to the number of values in numlist, where each value in numlist specifies a percentage to be used in calculating the density area that is enclosed by the contour. The contours are defined such that the density has a constant level along each contour, and the area enclosed by each contour corresponds to the total density area minus the specified percentage of the total area. In other words, the contours correspond to slices or levels of the density surface that are taken along the density axis. The "Levels" table also provides the minimum and maximum values for each contour along the direction of the data variable. By default, LEVELS=(1, 5, 10, 50, 90, 95, 99, 100).
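For illustration, a request for a three-row "Levels" table might look like the following sketch. It assumes the SASHELP.CARS sample data set and its Horsepower variable, which are not part of this section. The percentages are arbitrary.

```sas
/* Assumption: SASHELP.CARS sample data; the percentages 10, 50, and 90 are arbitrary. */
proc kde data=sashelp.cars;
   /* one contour level is computed for each percentage in the list */
   univar horsepower / levels=(10, 50, 90);
run;
```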
-
METHOD=SJPI | SNR | SNRQ | SROT | OS
specifies the method for computing the bandwidth. You can specify the following values:
- SJPI
Sheather-Jones plug-in method
- SNR
simple normal reference method
- SNRQ
simple normal reference method using interquartile range
- SROT
Silverman’s rule of thumb method
- OS
oversmoothed method
For a description of these methods, see the section Bandwidth Selection and Jones, Marron, and Sheather (1996). By default, METHOD=SJPI.
Note: The BW= option takes precedence over the METHOD= option. If both are specified, the METHOD= option is ignored.
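The following sketch shows one way to compare two of the bandwidth methods listed above. It assumes the SASHELP.CARS sample data set and its Horsepower variable, which do not appear in this section. The UNISTATS option is added so that each run reports the bandwidth it used.

```sas
/* Assumption: SASHELP.CARS sample data. */
proc kde data=sashelp.cars;
   /* Silverman's rule of thumb instead of the default SJPI method */
   univar horsepower / method=srot unistats;
run;

proc kde data=sashelp.cars;
   /* oversmoothed method, for comparison */
   univar horsepower / method=os unistats;
run;
```

Adding BW= to either step would override the METHOD= choice, as the note above describes.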
-
NGRID=number
NG=number
specifies the number of grid points to use for each kernel density estimate. To specify different numbers of grid points for different variables, specify NGRID=number as a v-option. By default, NGRID=401.
-
NOPRINT
suppresses output tables. You can use this option when you want to produce only graphical output.
-
OUT=SAS-data-set
names the output SAS data set to contain the kernel density estimate. This output data set contains the following variables:
- var, whose value is the name of the variable in the kernel density estimate
- value, whose value corresponds to grid coordinates for the variable
- density, whose values are equal to the kernel density estimate at the associated grid point
- count, whose values indicate the number of original observations contained in the bin that corresponds to a grid point
- distribution, whose values are equal to the distribution estimate at the associated grid point (appears only when the CDF global option is specified)
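As a sketch of writing the estimate to a data set, the following steps assume the SASHELP.CARS sample data set and its Horsepower variable. The output data set name kde_out and the grid size of 101 are hypothetical choices, not taken from this section.

```sas
/* Assumptions: SASHELP.CARS sample data; kde_out and NGRID=101 are hypothetical choices. */
proc kde data=sashelp.cars;
   /* CDF adds the distribution variable; NOPRINT suppresses the output tables */
   univar horsepower / ngrid=101 cdf out=kde_out noprint;
run;

/* inspect the first few grid points of the saved estimate */
proc print data=kde_out(obs=5);
run;
```

With these settings, kde_out would be expected to hold one observation per grid point for Horsepower, with the variables listed above.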
-
PERCENTILES
PERCENTILES=numlist
produces a table of percentiles for each UNIVAR variable. You can specify a list of percentiles to be computed in numlist. The default percentiles are 0.5, 1, 2.5, 5, 10, 25, 50, 75, 90, 95, 97.5, 99, and 99.5.
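A minimal sketch of requesting a custom set of percentiles, assuming the SASHELP.CARS sample data set and its Horsepower variable (not part of this section). The blank-separated list is an assumption about how numlist is written here.

```sas
/* Assumptions: SASHELP.CARS sample data; blank-separated numlist form. */
proc kde data=sashelp.cars;
   /* report only the 5th, 50th, and 95th percentiles */
   univar horsepower / percentiles=5 50 95;
run;
```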
-
PLOTS=(plot-request<(options)> <…plot-request <(options)>>)
specifies which plots of the univariate kernel density estimate to produce.
When you specify only one plot-request, you can omit the parentheses around the plot-request.
ODS Graphics must be enabled before plots can be requested. For example:
ods graphics on;
proc kde data=channel;
univar length / plots=histdensity;
run;
ods graphics off;
For more information about enabling and disabling ODS Graphics, see the section Enabling and Disabling ODS Graphics in Chapter 24, Statistical Graphics Using ODS.
You can specify the following plot-requests, each of which (except for DENSITYOVERLAY) produces a separate plot for every variable listed in the UNIVAR statement:
- ALL
produces all plots.
- DENSITY
produces the univariate kernel density estimate curve.
- DENSITYOVERLAY
produces the overlaid univariate kernel density estimate curves. If you specify more than one variable in the UNIVAR statement, PROC KDE overlays the density curves for all the variables on a single plot.
- HISTDENSITY
produces the univariate histogram of the data overlaid with the kernel density estimate curve.
- HISTOGRAM
produces the univariate histogram of data.
- NONE
suppresses all plots.
By default, if ODS Graphics is enabled and you do not specify the PLOTS= option, then the UNIVAR statement creates a histogram overlaid with a kernel density estimate. If you specify the PLOTS= option, only the requested plots are created.
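When more than one plot-request is specified, the parentheses are needed. The following sketch assumes the SASHELP.CARS sample data set and its MPG_City and MPG_Highway variables, which are not part of this section. The two variables were chosen because they share a scale and so overlay sensibly.

```sas
/* Assumption: SASHELP.CARS sample data with variables MPG_City and MPG_Highway. */
ods graphics on;
proc kde data=sashelp.cars;
   /* one overlaid density plot for both variables, plus a histogram for each */
   univar mpg_city mpg_highway / plots=(densityoverlay histogram);
run;
ods graphics off;
```

Because PLOTS= is specified, only the requested plots would be produced, and the default histogram with overlaid density is not created.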
-
SJPIMAX=number
specifies the maximum grid value in determining the Sheather-Jones plug-in bandwidth. The default value is two times the oversmoothed estimate.
-
SJPIMIN=number
specifies the minimum grid value in determining the Sheather-Jones plug-in bandwidth. The default value is the maximum grid value (see SJPIMAX=) divided by 18.
-
SJPINUM=number
specifies the number of grid values to use in determining the Sheather-Jones plug-in bandwidth. By default, SJPINUM=21.
-
SJPITOL=number
specifies the tolerance for terminating the bisection algorithm that is used in computing the Sheather-Jones plug-in bandwidth. By default, SJPITOL=0.001.
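The SJPI tuning options can be combined in one UNIVAR statement. The following sketch assumes the SASHELP.CARS sample data set and its Horsepower variable, which are not part of this section. The values 41 and 1e-4 are arbitrary and are not recommendations.

```sas
/* Assumption: SASHELP.CARS sample data; SJPINUM=41 and SJPITOL=1e-4 are arbitrary. */
proc kde data=sashelp.cars;
   /* finer search grid and tighter bisection tolerance than the defaults */
   univar horsepower / method=sjpi sjpinum=41 sjpitol=1e-4 unistats;
run;
```

SJPIMIN= and SJPIMAX= are left at their defaults here because sensible values depend on the scale of the data.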
-
TRUNCATE
sets the lower grid limit for each variable to the minimum observed value for that variable, and sets the upper grid limit for each variable to the maximum observed value for that variable.
Note: The GRIDL= and GRIDU= options take precedence over the TRUNCATE option. If either is specified, the corresponding grid limit is set to the specified value.
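The interaction between TRUNCATE and an explicit grid limit can be sketched as follows, assuming the SASHELP.CARS sample data set and its Horsepower variable (not part of this section). The lower limit of 0 is arbitrary.

```sas
/* Assumption: SASHELP.CARS sample data; GRIDL=0 is an arbitrary lower limit. */
proc kde data=sashelp.cars;
   /* both grid limits are set to the observed minimum and maximum */
   univar horsepower / truncate;
run;

proc kde data=sashelp.cars;
   /* GRIDL= takes precedence, so the lower grid limit is 0 */
   univar horsepower / truncate gridl=0;
run;
```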
-
UNISTATS
produces, for each variable, a table that contains standard univariate statistics and the bandwidth that is used to compute its kernel density estimate. The statistics listed are the mean, variance, standard deviation, range, and interquartile range.