Please send questions or feedback to: cedars@sound-data.com ------------------------------------------------------------------ Table of Contents ------------------------------------------------------------------ Introduction Specification notes Data file preparation Data submissions Data change management QC Tiers Tier 1 Tier 2 Tier 3 Source of Truth files claim_spec validation_rules warning_rules CEDARS savings equations First year gross First year net Lifecycle gross Lifecycle net ------------------------------------------------------------------ Specification notes ------------------------------------------------------------------ Cedars requirements are captured in two source of truth files; these files define: 1. The table specifications (table names, field names, data types, primary and foreign keys) 2. The QC rules for claims data being submitted to Cedars. QC rules are broken out into categories: a. Single field QC rules: found in claim_spec.sql b. QC rules that require evaluation of more than on field: found in validation_rules.xlsx. Within this group there are two categories: i. Rules: if a record does not comply with the rule it is rejected ii. Warnings: if a record triggers a warning the data are accepted and the warning is provided back to the PA for review. Some notes about these source of truth files: 1. For operation, the 'validation' and 'warnings' tabs in validation_rules.xlsx are broken out into two machine-readable files called 'validation_rules.csv' and 'validation_warnings.csv'. These are also offered for download. 2. Additionally, there is a 'claims_metadata.csv' file that explains the semantics attached to each field in each of the files in a claims upload. 3. The three source of truth files support three tiers of QC; the tiers are additive and evaluated in sequence. 4. Being a primary key constrains a field to be both not null and unique. 5. If you are submitting data for a ClaimID that already exists in CEDARS, and your new data for that ClaimID passes QC, CEDARS will overwrite the data which already exists for that ClaimID. 6. The terms 'picklist' and 'value list' are used interchangeably in this spec; they mean the same thing. 7. SQL files can be opened and read in any text editor. ------------------------------------------------------------------ Data file preparation ------------------------------------------------------------------ Claim data submissions consist of seven .csv data tables zipped together, in a file name of PA choosing, into a .zip archive. A Readme.txt may optionally be included in the .zip archive. CEDARS requires data files be prepared using the pipe or up-bar character "|" as the delimiter with unquoted text values. Files must contain the correct field names, using the specified casing (e.g. CamelCaps), and fields must be in the correct order. All seven data files must be submitted in each .zip submission. ------------------------------------------------------------------ Data submissions ------------------------------------------------------------------ PAs will begin testing Cedars with their Q3 2015 data; CCA/RENs will begin testing in November 2015; IOUs will begin testing in December. PAs are asked to continue to test Cedars with their remaining 2015 submissions (Q4 and annual). Official 2015 data submissions will continue to be processed by Itron. CEDARS starts off with a clean slate and will only know claims submitted directly by PAs to CEDARS. Official 2016 data submissions will be submitted to Cedars, beginning with Q1 2016; test data submitted by PAs in 2015 will be purged from the CEDARS database prior to the official launch. ------------------------------------------------------------------ Data change management ------------------------------------------------------------------ CEDARS will next consider changes to the specification and QC requirements in February, any revisions will go live in March. Please email change suggestions to cedars@sound-data.com; you will receive a response with a link that will allow you to continue to correspond with the CEDARS team, and may monitor work on your suggestion via the CEDARS ticketing system. ------------------------------------------------------------------ QC Tier 1: High level specification compliance ------------------------------------------------------------------ Table level tests 1. Do the data have the right table and field names? 2. Do the data comply with the table primary key requirements? 3. Is there one record in CustomMeasure or DeemedMeasure for every claim? If the answer to any question is no, the entire data submission is refused. Record level tests 1. Do the data have the correct data types? If the answer is no, the record is refused. 2. Are the data free from line feeds (LF) and carriage returns (CR) between records? If the answer is no, the LF and CR are stripped and the data are accepted. ------------------------------------------------------------------ QC Tier 2: Cross table validation ------------------------------------------------------------------ We use the SQL syntax of TableName.FieldName to explicitly specify the table and field being compared. These rules are evaluated at the record level. 1. Is ContactClaim.ClaimID present in Claim.ClaimID? If the answer is no, reject the ContactClaim record. 2. Is ContactClaim.ContactID present in Contact.ContactID? If the answer is no, reject the ContactClaim record. 3. Is Claim.SiteID present in Site.SiteID? If the answer is no, reject the Claim record. 4. Is coalesce(CustomMeasure.ClaimID, DeemedMeasure.ClaimID) present in Claim.ClaimID? If the answer is no, reject the DeemedMeasure or CustomMeasure record. 5. Is the same ClaimID present in both DeemedMeasure and CustomMeasure? If the answer is yes, reject both DeemedMeasure and CustomMeasure records. 6. Is Claim.ClaimID present in coalesce(CustomMeasure.ClaimID, DeemedMeasure.ClaimID)? If the answer is no, reject the Claim record. ------------------------------------------------------------------ QC Tier 3: Field rules ------------------------------------------------------------------ In Tier 3 we perform record-level validation of claims using the CEDARS source of truth files. Tier 3 has two categories of validation: single field and multiple field. Single field validations are in the claim_spec.sql file. Multi-filed rules are in the validation_rules.csv and warnings_rules.csv files. Please see the information below on the Source of Truth files. ------------------------------------------------------------------ Source of Truth File: claim_spec.sql ------------------------------------------------------------------ The claim_spec file is written as an enhanced SQL create table script that includes all single field constraints, table definitions, field names, and data types. We are calling our enhanced SQL dialect SQLPlus. Our SQLPlus dialect includes the usual data types (i.e. bit, nvarchar, numeric), plus some extra ones: CedarsDate: A date field that is validated to confirm that the given data parses to a date between nine years before present and the date of the submission. WarningCedarsDate: Date constraints that raise warnings instead of causing the records to be rejected. Uses the same rules as CedarsDate, that is, the date must be within nine years of date of submission. CedarsYearQuarter: A six character text field which is validated to confirm that the given data parses to a quarter of the form 2015Q3, is within five years before the date of the submission and is not in the future. CedarsFundingCycle: The funding cycle of the claim, must be a prior funding cycle or the current program cycle. Example values: '2010-2012', '2013-2015', '2016' CedarsNumRange: Numeric types that are constrained. Square brackets are inclusive [greater/less than or equal to], parenthesis are exclusive (greater/less than). That is: CedarsNumRange (0,1) means greater than zero and less than one CedarsNumRange [0,1] means greater than or equal to zero and less than or equal to one CedarsNumRange (0,1] means greater than zero and less than or equal to one WarningCedarsNumRange: Numeric constraints that raise warnings instead of causing the records to be rejected. Uses the same notation conventions as CedarsNumRange. CedarsValueList: Text types that are constrained to certain value lists as specified by Commission Staff. NotEmpty: usually in final position, it means the field is never allowed to be empty. DefaultFalse: used only for boolean flag, usually in final position, it means providing this flag is optional, but the value of the field will be autofilled with False. Additionally, because every claim record has one corresponding record in either the CustomMeasure or DeemedMeasure table, in the table field in the spreadsheet we use the SQL syntax: coalesce (DeemedMeasure, CustomMeasure) for rules that require the claim table to be compared to either the CustomMeasure or DeemedMeasure table - whichever is populated for that claim. In SQL, coalesce means to take the first not null value from the options listed in the parenthesis. ------------------------------------------------------------------ Source of Truth File: validation_rules.csv ------------------------------------------------------------------ The validation_rules file provides are rules that, if violated, will cause the claim record to be rejected. Data that fail validation rules are rejected on a record-level basis. The QC rules in this file require more than one field to evaluate. Some rules are conditionally applied to the data; conditional rules will have one or more set of where clause columns populated in the csv. Validation rules have up to two where clauses; warnings have up to five where clauses. The rules used to specify validation conditions: is null or zero is not null less than less than or equal to greater than greater than or equal to equals does not equal is_california_zipcode combo_value_list matches_year_quarter All those are self-explanatory, except the last two: - a 'combo_value_list' rule means that all the values in the listed fields in a given record must be an existing combination in the corresponding combo according to the cedars specification. - a 'matches_year_quarter' rule is used to ensure that a given ProgramCost's PrgYear matches its ProgramYearQuarter. All rules define the expected state of the data; that is, when data do not comply with the requirement specified, the data are rejected. ------------------------------------------------------------------ Source of Truth File: warnings_rules.csv ------------------------------------------------------------------ The warnings_rules file provides are rules that, if violated, returns a warning to the user and the data are accepted. Data are tested against warnings on record-level basis. Some of these rules are conditionally applied to the data; conditional rules will have one or more set of where clause columns populated in the spreadsheet. Warnings have up to five where clauses. The rules used to specify warning conditions: is null or zero is not null less than less than or equal to greater than greater than or equal to equals does not equal All rules define the expected state of the data; that is, when data do not comply with the requirement specified by the rule, the data are accepted, but a warning is returned to the user. ------------------------------------------------------------------ Cedars savings equations First year, Lifecycle, Gross, Net ------------------------------------------------------------------ --------------------------------------- First year gross --------------------------------------- sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * Claim.NumUnits * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW2ndBaseline) as TotalFirstYearGrosskW, sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * Claim.NumUnits * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh2ndBaseline) as TotalFirstYearGrosskWh, sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * Claim.NumUnits * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm2ndBaseline) as TotalFirstYearGrossTherm, --------------------------------------- First year net --------------------------------------- sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * Claim.NumUnits * Claim.NTGRkW * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRkW * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW2ndBaseline) as TotalFirstYearNetkW, sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * Claim.NumUnits * Claim.NTGRkWh * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRkWh * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh2ndBaseline) as TotalFirstYearNetkWh, sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * Claim.NumUnits * Claim.NTGRTherm * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRTherm * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm2ndBaseline) as TotalFirstYearNetTherm, --------------------------------------- Lifecycle gross --------------------------------------- sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * Claim.NumUnits * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW2ndBaseline ) as TotalLifecycleGrosskW, sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * Claim.NumUnits * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh2ndBaseline ) as TotalLifecycleGrosskWh, sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * Claim.NumUnits * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm2ndBaseline ) as TotalLifecycleGrossTherm, --------------------------------------- Lifecycle net --------------------------------------- sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * Claim.NumUnits * Claim.NTGRkW * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRkW * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW2ndBaseline ) as TotalLifecycleNetkW, sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * Claim.NumUnits * Claim.NTGRkWh * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRkWh * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh2ndBaseline ) as TotalLifecycleNetkWh, sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * Claim.NumUnits * Claim.NTGRTherm * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRTherm * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm2ndBaseline ) as TotalLifecycleNetTherm,