Sparse Data Rule (sparse_data)
The sparse data rule calculates mean concentrations for sparse sampling designs. The user specifies the grouping variables, the time variable, and the concentration variable. The rule then calculates the mean and standard deviation of the concentration at each time point within each group. The summarized data can then be used for PK analysis. Note that this rule must be the last rule in a data cleaning pipeline.
| Field | Description | Required |
|---|---|---|
| groupingColumns | Array of columns used to group the data for summarization | Yes |
| timeColumn | Column that contains the time variable | Yes |
| concentrationColumn | Column that contains the concentration data for summarization. Should contain only numeric values. | Yes |
| groupIdColumnName | New column in the output that will contain group ID numbers | Yes |
| carryAlongColumns | Columns whose data will be carried into the output file. Only the first value in each profile is used. | No |
| uniqueId | Column that contains unique subject identifiers | Yes |
| includedIdsColumnName | New column in the output that will contain the unique subject identifiers used in the summarization | Yes |
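The required/optional split in the table can be checked before a rule is run. The following is a minimal Python sketch, assuming the rule has already been loaded into a dict; `REQUIRED_FIELDS` and `missing_required_fields` are hypothetical names used for illustration and are not part of the tool itself.

```python
# Fields marked "Yes" in the Required column of the table above.
# carryAlongColumns is optional and therefore not listed.
REQUIRED_FIELDS = (
    "groupingColumns",
    "timeColumn",
    "concentrationColumn",
    "groupIdColumnName",
    "uniqueId",
    "includedIdsColumnName",
)


def missing_required_fields(rule):
    """Return the required field names that are absent from a rule dict."""
    return [field for field in REQUIRED_FIELDS if field not in rule]


# An incomplete rule, for illustration only.
incomplete_rule = {
    "type": "sparse_data",
    "timeColumn": "time",
    "concentrationColumn": "conc",
}
missing = missing_required_fields(incomplete_rule)
```

Here `missing` lists the four required fields that the incomplete rule does not define.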
Example:
```json
{
  "description": "Sparse Data Description",
  "version": "3.0.0",
  "groupingColumns": [
    "sex",
    "dose"
  ],
  "timeColumn": "time",
  "concentrationColumn": "conc",
  "groupIdColumnName": "group_id",
  "carryAlongColumns": [
    "dose"
  ],
  "uniqueId": "subject",
  "includedIdsColumnName": "subjects_included",
  "type": "sparse_data"
}
```

Behavior: Concentration data in conc will be summarized by dose and sex at each timepoint in time. The new group identifiers will be in group_id, and included subjects for each summary time point will be in subjects_included.
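To make the summarization concrete, here is a rough Python sketch of the kind of calculation described above, using only the standard library. It is an illustration under stated assumptions, not the rule's actual implementation: the function name `summarize_sparse`, the output column names `mean` and `sd`, the numbering of group IDs in order of first appearance, and the choice to carry along the first value per group and time point are all assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_sparse(rows, rule):
    """Hypothetical sketch: mean and SD of concentration per group and time point."""
    group_cols = rule["groupingColumns"]
    time_col = rule["timeColumn"]
    conc_col = rule["concentrationColumn"]
    carry_cols = rule.get("carryAlongColumns", [])  # optional field

    buckets = defaultdict(list)  # (group key, time) -> rows in that bucket
    group_ids = {}               # group key -> 1-based ID (assumed numbering)
    for row in rows:
        group_key = tuple(row[col] for col in group_cols)
        group_ids.setdefault(group_key, len(group_ids) + 1)
        buckets[(group_key, row[time_col])].append(row)

    summary = []
    for (group_key, time), members in buckets.items():
        concs = [float(r[conc_col]) for r in members]
        out = dict(zip(group_cols, group_key))
        out[rule["groupIdColumnName"]] = group_ids[group_key]
        out[time_col] = time
        out["mean"] = mean(concs)  # assumed output column name
        # Sample SD is undefined for a single observation, so report None.
        out["sd"] = stdev(concs) if len(concs) > 1 else None
        out[rule["includedIdsColumnName"]] = [r[rule["uniqueId"]] for r in members]
        for col in carry_cols:
            out[col] = members[0][col]  # first value only
        summary.append(out)
    return summary


rule = {
    "groupingColumns": ["sex", "dose"],
    "timeColumn": "time",
    "concentrationColumn": "conc",
    "groupIdColumnName": "group_id",
    "carryAlongColumns": ["dose"],
    "uniqueId": "subject",
    "includedIdsColumnName": "subjects_included",
    "type": "sparse_data",
}
# Made-up rows, for illustration only.
rows = [
    {"subject": 1, "sex": "F", "dose": 10, "time": 1.0, "conc": 2.0},
    {"subject": 2, "sex": "F", "dose": 10, "time": 1.0, "conc": 4.0},
    {"subject": 3, "sex": "M", "dose": 10, "time": 1.0, "conc": 5.0},
]
result = summarize_sparse(rows, rule)
```

With these made-up rows, `result` has one entry per group and time point: the two female subjects are averaged together, and the single male subject forms its own group with no standard deviation.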