DW - CovidCheq (Integration Package)
Last Updated 5/24/2021
Overview
Location
Schedule
Details
Packages
master
create_files
send_archive
cleanup
Output File Defintions
Change Log
Troubleshooting Notes
Overview:
The packages in the DW-CovidCheq project are designed to send data to the Qualtrics directory. It is an integration with the Qualtrics API that allows us to automatically create and updated embedded data for directory contacts in Qualtrics
The ETL has two main components. An SSIS package that creates files for student, employees and faculty, and a package that sends those files to the Qualtrics file service for processing in Qualtrics.
Location:
The project – DW-CovidCheq – contains all the packages for this integration and lives in the ETL-BI-Integrations solution in SSIS and in the BI-Integrations folder in the SSISDB catalog.
Schedule:
The packages in this project run as part of the jobs below (full dataMarq schedule).
ETL – BI-Integration – CovidCheq Full/Inc – Every 1 hr
Project and Package Details:
The packages in ETL – Qualtrics work with the Qualtircs API and don’t operate on our normal one package per table paradigm of other DB related loads. Instead, there are three main packages, one for loading survey data, one for loading/updating users, and one (decommissioned) for loading contact data .
Project Parameters
apiToken: The unique api Token that allows access to the Qualtrics instance and the api methods
automationId: Automations are the processes in the Qualtrics tool that load data files to the contact directory. There is a unique automation set up to process the files produced by this project
cleanupDays: The number of days back to look back and move any files older than this to the archive folder on the BI shared drive
covidFilePath: The parent directory to store the created files on the BI shared drive - \\marqnet.mu.edu\depts\ITS\BI\covidCheq\
curlPath: The path on the appropriate server/local machine to the curl executable. Curl is a tool used to interact with web services.
dwConnStr: The connection to the approprate dataMarq data warehouse instance
fullRefresh: 1 = Full Refresh, 0 = Incremental
gpgPath: gpg is a tool that encrypts the data in the files and is compatible with Qualtrics' encryption. Once files are created by SSIS they are encrypted using a gpg key and then sent to Qualtrics. This encryption is to secure HIPAA data contained in the files. Qualtrics has the gpg key and the tool decodes the file so only encrypted files are sent
gpgRecipient: A name for the key holder to encrypt it with the appropriate encryption
refreshDays: How many days to look back to pull data for incremental refreshes
Packages
MASTER
Overview
This is the master orchestration package.
Package Parameters
None
Package Variables
None
Package Flow
- Run create_files package
- Run send_archive package
- Run cleanup package
CREATE_FILES
Overview
This package creates the files are are sent to Qualtrics.
Package Parameters
None
Package Variables
fileDate: A date set a run time to tag on to the file name to indicate when the file was created
fileName: This package creates three separate files to speed up processing, an Undergrad file (ugrd_), a Grad file (grad_), a visitors file (vis_), and an employee file (emp_). This variable sets the prefix for the file name.
fileTimestamp: A timestamp to tag on to the file name to indicate when the file was created
list: Each file is created using a list variable (ugrd, grad, vis, emp). These lists are stored in an XML file
listId: The unique ID in the XML file for each list/file to be created
selectStatement: The select statement for that generates the data for the file. It is essentially a select statement from the table ops.d_covidcheq_list.
sqlStatement: The full sql statement to be passed to the db source task
whereRefresh: The where clause to be added to the where statement when an incremental job is run.
@[$Project::fullRefresh] == 1 ? "" : " AND DATEDIFF(HOUR,sys_updated,getdate()) < " + @[$Project::refreshDays]
whereStatement: The where clause to be added to the sql statement.
@[User::whereRefresh] + " " + @[User::whereUsers]
whereUsers: The where clause to be added to the where statement to indicate which users to populate for a given file
@[User::whereRefresh] + " " + @[User::whereUsers]
XML Lists Example
<Lists>
<List>
<Id>1</Id>
<Name>ugrd</Name>
</List>
<List>
<Id>2</Id>
<Name>grad</Name>
</List>
</Lists>
Package Flow
Package Snapshots
SEND_ARCHIVE
Overview
This package enrypts and sends the files that are created in the create_files packages to Qualtrics file service for processing by the automated job to populated the contact directory
Package Parameters
None
Package Variables
fileName: the file path to be send to Qualtrics, determined by the file loop of the covidFilePath project parameter
Package Flow
Package Snapshots
CLEANUP
Overview
This package has one execute process task that runs a powershell command to delete files that are older than the cleanupDays parameter.
The command that is run is an expression:
"Get-ChildItem –Path \"" + @[$Project::covidFilePath] + "archive\" -Recurse | Where-Object {($_.LastWriteTime -lt (Get-Date).AddDays("+ @[$Project::cleanupDays] + "))} | Remove-Item"
File Definitions
Field |
Type |
Length |
userid | wstr | 100 |
status_code | wstr | 25 |
status | wstr | 75 |
faculty_ind | wstr | 1 |
staff_ind | wstr | 1 |
student_ind | wstr | 1 |
person_type | wstr | 25 |
employment_category | wstr | 30 |
college_org_code | wstr | 10 |
college | wstr | 50 |
vp_code | wstr | 5 |
vp_area | wstr | 80 |
job_title | wstr | 150 |
primary_position_title | wstr | 100 |
primary_supervisor_muid | wstr | 150 |
primary_supervisor_name | wstr | 240 |
dept_chair_muid | wstr | 150 |
dept_chair_name | wstr | 240 |
chair_job_title | wstr | 150 |
chair_position_title | wstr | 100 |
next_up_supervisor_muid | wstr | 150 |
next_up_supervisor_name | wstr | 240 |
location_code | wstr | 60 |
office_location | wstr | 45 |
office_number | wstr | 45 |
office_building | wstr | 60 |
student_level_code | wstr | 4 |
student_level | wstr | 15 |
acad_load_code | wstr | 1 |
degree_seeking_ind | str | 1 |
acad_career_code | wstr | 5 |
acad_prog_code | wstr | 5 |
primary_acad_plan | wstr | 10 |
acad_plans | wstr | 150 |
res_hall_code | wstr | 1 |
res_hall | wstr | 30 |
acad_career_rep | wstr | 50 |
acad_career_sh | wstr | 50 |
muid | wstr | 11 |
first_name | wstr | 150 |
wstr | 240 | |
cell_phone | wstr | 30 |
pref_phone | wstr | 30 |
trace_ind | wstr | 1 |
local_street_address | wstr | 450 |
local_city | wstr | 100 |
local_state_code | wstr | 10 |
local_postal_code | wstr | 25 |
last_name | wstr | 150 |
student_employee_ind | wstr | 1 |
tenured_ind | wstr | 1 |
first_gen_ind | wstr | 1 |
visitor_ind | wstr | 1 |
gender | wstr | 50 |
race_ethnicity | wstr | 50 |
ed_dsc_exclude | wstr | 1 |
delinquent_ind | str | 1 |
primary_hr_dept_code | wstr | 10 |
primary_hr_dept | wstr | 50 |
primary_acad_dept_code | wstr | 25 |
primary_acad_dept | wstr | 100 |
externalreference | wstr | 11 |
ED_DSC_OPT_OUT | wstr | 1 |
full_time_ind | wstr | 1 |
primary_instructor_ind | wstr | 1 |
currently_attending_ind | wstr | 1 |
athletics_ind | wstr | 1 |
sport_code | wstr | 5 |
sport | wstr | 25 |
degree_type | wstr | 100 |
campus_housing_ind | str | 1 |
attribute1 | wstr | 150 |
attribute2 | wstr | 150 |
attribute3 | wstr | 150 |
attribute4 | wstr | 150 |
attribute5 | wstr | 150 |
WEEKEND_OPT_IN | wstr | 1 |
online_only_ind | wstr | 1 |
quarantine_status | wstr | 50 |
last_90_days_ind | str | 1 |
attribute6 | wstr | 150 |
attribute7 | wstr | 150 |
attribute8 | wstr | 150 |
attribute9 | wstr | 150 |
attribute10 | wstr | 150 |
hall_director_email | wstr | 240 |
current_surveillance_ind | wstr | 1 |
surveillance_appt_ind | wstr | 1 |
surveillance_test_ind | wstr | 1 |
surveillance_delinquent_ind | wstr | 1 |
attribute11 | wstr | 150 |
attribute12 | wstr | 150 |
attribute13 | wstr | 150 |
attribute14 | wstr | 150 |
attribute15 | wstr | 150 |
Change Log:
Date | Developer | Change Record |
5/24/2021 | N/A | Initial Documentation |
Troubleshooting Notes:
-
Occasional Failures Due to Qualtrics API
Calls to the Qualtrics web service will occasionally fail. This is not due to a problem with the code, but a limiting of calls as we are running these quite frequently. The only issue would be if the job fails repeatedly. This has not happened with the stage load, only with the downstream procs -
Changes to old data in a survey (old would be more than the 2 days we refresh)
This mostly happens with the contact tracing survey where the Medical Clinic enters in a close contact – or someone enters a self disclosure – and then finds out it shouldn’t have been. When this happens within two days of the response it is not a problem as the incremental refresh picks up the last two days and will alter the response. If it is older than two days the remedy is to DELETE the staging row with the bad response. This is preferrable to running the survey load job with different start/end dates which can be