insheet using data.csv, clear note: data imported into stata TS note: Reference study is {{reference_study}} label drop _all * do this in case all usernames were numeric and this got imported as numeric by accident tostring username relates_to_username, replace * encode key variables tostring condition study, replace sencode condition, replace sencode study, replace label variable observation_index "Observation # for script" note observation_index: Scripts can create multiple observations, so this indexes them by date. * Format study dates capture: generate double due_date = clock(due,"YMD hms#") capture: generate double finished_date = clock(finished, "YMD#hms#") capture: generate double started_date = clock(started, "YMD#hms#") capture: generate double randomised_date = clock(randomised, "YMD#hms#") capture: tostring collected_on, replace capture: generate double collected_on_date = clock(collected_on,"YMD#hms#") generate double collected = finished_date replace collected = collected_on_date if collected_on_date != . label variable collected "Date" note collected: Definitive date for this datapoint drop due drop finished drop started drop randomised drop collected_on_date format due_date %tc format started_date %tc format randomised_date %tc format collected %tc capture: label variable randomised_date "Randomisation date" capture: note randomised_date: "Date user was randomised to THIS study (may not be the reference study)." capture: label variable due_date "Date scheduled" capture: note due_date: This is the date the observation was due to occur capture: label variable finished_date "Date captured" capture: note finished_date: This is when the data were captured by the system capture { // calculate days in study, days in reference study bysort username study: egen double _dayzero = min(randomised_date) if is_reference_study == 1 bysort username (_dayzero): gen double dayzero = _dayzero[1] drop _dayzero format dayzero %tc gen double day_in_study = floor( (collected - randomised_date)/1000/60/60/24 ) gen double day_in_trial = floor( (collected - dayzero) /1000/60/60/24 ) label variable dayzero "Date first entry" note dayzero: "Date at which user was randomised to the REFERENCE study (as defined when the data was exported)." label variable day_in_trial "Day" note day_in_trial: "Days offset from the date the user entered the REFERENCE study (see dayzero)" label variable day_in_study "Day in substudy" note day_in_study: "Days offset from the date the user entered THIS study" } // label other vars label variable is_canonical_reply "The definitive reply for this observation" // SETUP VARIABLE AND VALUE LABELS {% for q in questions %} // ----------------------------------------------------------------------------- //{{q.variable_name|safe }} {{ q.set_format|safe }} {{ q.label_variable|safe }} {{ q.label_choices|safe }} {% endfor %} compress order _all, sequential order username dayzero study condition script script observation_index is_canonical_reply due_date collected day_in* , first saveold data, replace export excel data_values.xlsx, replace firstrow(var) nolabel export excel data_labels.xlsx, replace firstrow(var)