Please send any question or feedback to: cedars@sound-data.com ------------------------------------------------------------------ Table of Contents ------------------------------------------------------------------ Introduction Specification notes Data file preparation Data submissions Data change management QC Tiers Tier 1 Tier 2 Tier 3 Source of Truth files claim_spec validation_rules warning_rules CEDARS cost equations CEDARS savings equations First year gross First year net Lifecycle gross Lifecycle net ------------------------------------------------------------------ Specification notes ------------------------------------------------------------------ Cedars requirements are captured in three source of truth files: claim_spec.sql, validation_rules.csv, and warning_rules.csv. These files define: 1. The table specifications (table names, field names, data types, primary and foreign keys) 2. The QC rules for claims data being submitted to Cedars. QC rules are broken out into categories: a. Single field QC rules: found in claim_spec.sql b. QC rules that require evaluation of more than on field are found in validation_rules.csv and warning_rules.csv. These rules are split into two categories: i. Rules: if a record does not comply with the rule it is rejected ii. Warnings: if a record triggers a warning the data are accepted and the warning is provided back to the PA for review. Some notes about these source of truth files: 1. Additionally, there is a 'claims_metadata.csv' file that explains the semantics attached to each field in each of the files in a claims upload. 2. The three source of truth files support three tiers of QC; the tiers are additive and evaluated in sequence. 3. Being a primary key constrains a field to be both not null and unique. 4. If you are resubmitting data for a ClaimID that already exists in CEDARS, and your new data for that ClaimID passes QC, CEDARS will overwrite the data which already exists for that ClaimID. ------------------------------------------------------------------ Data file preparation ------------------------------------------------------------------ For the purpose of this section, an 'upload' is both the file format for claiming under CEDARS, and the act of uploading one of such files. * Claim data uploads consist of seven .csv data tables as defined in the spec. * These seven .csv data tables will be zipped together, in a file name of PA choosing, into a .zip archive. A Readme.txt may optionally be included in the .zip archive. * CEDARS requires data files be prepared using the pipe or up-bar character "|" as the delimiter with unquoted text values. * Files must contain the correct field names, using the specified casing (e.g. CamelCaps), and fields must be in the correct order. * All seven data files must be uploaded in each .zip upload. * The maximum precision allowed is 15 digits past the decimal. ------------------------------------------------------------------ Data submissions ------------------------------------------------------------------ For the purpose of this section, a 'submission' is a quarterly or yearly submission period for CEDARS claims/costs. Each submission can be composed of one or more uploads as defined in the previous section. Data submissions are confirmed for each PA by a PA admin five times a year: * once per quarter, all year-to-date uploads are confirmed, with a focus on reviewing the current quarter's uploads. * once a year, at the end of year, the yearly submission is confirmed. Confirmation of a submission period (quarterly or yearly) requires certain conditions: * A quarterly submission can be confirmed only if it has at least one completed (validated and accepted) upload for the present quarter. * A quarterly submission can't be confirmed if it contains ProgramCosts for a future quarter. Example: the confirmation button for the 2016Q3 submission won't be available if CEDARS contains ProgramCost records for the 2016Q4 quarter. This can happen in the case of reopened submission periods (more on this below). * A yearly submission can be confirmed even if no uploads were made in the current yearly review period. This can happen if a review confirms that uploads for Q1 through Q4 were correct. * A quarterly submission will not be allowed to be confirmed if there exist ProgramIDs in Claims for which there isn't a corresponding ProgramCost record for that ProgramID. This is a CET rule. * A quarterly submission will not be allowed to be confirmed if there exist ProgramIDs in the ProgramCost table for which there isn't a corresponding Claim for that ProgramID. Again, this is a CET rule. During review, CPUC might find that a confirmed submission period has issues that need fixing. In this case, a submission period can be re-opened by a CPUC ED. Here are the conditions under which the CEDARs system will enable the re-opening of a submission period: * The submission period is the latest confirmed. CEDARS will allow re-opening Q3 if the currently active submission period is Q4, but it won't allow re-opening Q2 if Q3 has also been confirmed. * The current submission period hasn't been re-opened. In the previous case, if a CPUC ED reviews a confirmed 2016Q3 and re-opens it, the 2016Q3 submission period will move from "confirmed" to "re-opened". Re-opening 2016Q2 won't be possible. In short, it's only possible to go back only one period, not two or more. ------------------------------------------------------------------ QC Tier 1: High level specification compliance ------------------------------------------------------------------ Table level tests 1. Do the data have the right table and field names? 2. Do the data comply with the table primary key requirements? 3. Is there one record in CustomMeasure or DeemedMeasure for every claim? 4. Is there at most one record in WaterMeasure for every claim? If the answer to any question is no, the entire data submission is refused. Record level tests 1. Do the data have the correct data types? If the answer is no, the record is refused. 2. Are the data free from line feeds (LF) and carriage returns (CR) between records? If the answer is no, the LF and CR are stripped and the data are accepted. 3. Are the data free from non-ASCII characters? If the answer is no, the whole record is rejected. ------------------------------------------------------------------ QC Tier 2: Cross table validation ------------------------------------------------------------------ We use the SQL syntax of TableName.FieldName to explicitly specify the table and field being compared. These rules are evaluated at the record level. 1. Is ContactClaim.ClaimID present in Claim.ClaimID? If the answer is no, reject the ContactClaim record. 2. Is Claim.SiteID present in Site.SiteID? If the answer is no, reject the Claim record. 3. Is coalesce(CustomMeasure.ClaimID, DeemedMeasure.ClaimID) present in Claim.ClaimID? If the answer is no, reject the DeemedMeasure or CustomMeasure record. 4. Is the same ClaimID present in both DeemedMeasure and CustomMeasure? If the answer is yes, reject both DeemedMeasure and CustomMeasure records. 5. Is Claim.ClaimID present in coalesce(CustomMeasure.ClaimID, DeemedMeasure.ClaimID)? If the answer is no, reject the Claim record. 6. Is WaterMeasure.ClaimID present in Claim.ClaimID? If the answer is no, reject the WaterMeasure record. In addition, the following cross table validation rules are enforced as Tier 3 field rules for implementation reasons: 7. Is Claim.ClaimID present in ContactClaim.ClaimID for all claims except where Claim.DeliveryType = ‘C&S’? If the answer is no, reject the Claim record. 8. Is Claim.ClaimID present in WaterMeasure.ClaimID for all claims where Claim.WaterOnlyFlag = 1? If the answer is no, reject the Claim record. ------------------------------------------------------------------ QC Tier 3: Field rules ------------------------------------------------------------------ In Tier 3 we perform record-level validation of claims using the CEDARS source of truth (SoT) files. Tier 3 has two categories of validation: single field and multiple field. Single field validations are in the claim_spec.sql file. Multi-field rules are in the validation_rules.csv and warnings_rules.csv files. Please see the information below on the Source of Truth files. ------------------------------------------------------------------ Source of Truth File: claim_spec.sql ------------------------------------------------------------------ The claim_spec file is written as an enhanced SQL create table script that includes all single field constraints, table definitions, field names, and data types. We are calling our enhanced SQL dialect SQLPlus. Our SQLPlus dialect includes the usual data types (i.e. bit, nvarchar, numeric), plus some extra types and modifiers: CedarsDate: A date field that is validated to confirm that the given data parses to a date between nine years before present and the date of the submission. WarningCedarsDate: Date constraints that raise warnings instead of causing the records to be rejected. Uses the same rules as CedarsDate, that is, the date must be within nine years of date of submission. CedarsYearQuarter: A six character text field which is validated to confirm that the given data parses to a quarter of the form 2015Q3, is within five years before the date of the submission and is not in the future. Note that, for Claim uploads, "present" means the year and quarter of the PA's currently active submission, not the calendar year and quarter. Therefore, if a PA is uploading records for the 2016Q3 submission in early January 2017, the "date of the submission" is the current submission's quarter: 2016Q3 CedarsFundingCycle: The funding cycle of the claim, must be a prior funding cycle or the current program cycle. Example values: '2010-2012', '2013-2015', '2016' CedarsNumRange: Numeric types that are constrained. Square brackets are inclusive [greater/less than or equal to], parenthesis are exclusive (greater/less than). That is: CedarsNumRange (0,1) means greater than zero and less than one CedarsNumRange [0,1] means greater than or equal to zero and less than or equal to one CedarsNumRange (0,1] means greater than zero and less than or equal to one WarningCedarsNumRange: Numeric constraints that raise warnings instead of causing the records to be rejected. Uses the same notation conventions as CedarsNumRange. CedarsValueList: Text types that are constrained to certain value lists as specified by Commission Staff. MarginsOfWarningAndError: Compares uploaded values of savings and costs to ex-ante calculated values. This takes a tuple of values with the percentage that uploaded values will be allowed to deviate from the ex-ante calculation. It generates one warning validator and one error validator. We'll use the following example: MarginsOfWarningAndError (0.01, 0.1) Passing with no errors nor warning: The uploaded value will be accepted without either warning or error if the difference with the ex-ante value is lower than 0.01% Passing, with a warning: The uploaded value will be accepted, but a warning will be issued, if the difference with the ex-ante value is between 0.01 and 0.1% Rejected, with an error: The uploaded value will be rejected, and an error will be issued, if the difference with the ex-ante value is higher than 0.1% NotEmpty: usually in final position, it means the field is never allowed to be empty. DefaultFalse: used only for boolean flag, usually in final position, it means providing this flag is optional, but the value of the field will be autofilled with False. StartsWith: used with textual fields, compares two fields in the same table. This works as in English: `ClaimID startswith PA` is true for "PGE-3" and "PGE" Matches: used with textual fields, compares two fields in the same table. This works as in English: `PA matches UploadPA` is true for "SDGE" and "SDGE" ------------------------------------------------------------------ Source of Truth File: validation_rules.csv ------------------------------------------------------------------ The validation_rules file provides are rules that, if violated, will cause the claim record to be rejected. Data that fail validation rules are rejected on a record-level basis. The QC rules in this file require more than one field to evaluate. Some rules are conditionally applied to the data; conditional rules will have one or more set of where clause columns populated in the csv. Validation rules have up to two where clauses; warnings have up to five where clauses. The rules used to specify validation conditions: is null or zero is not null less than less than or equal to greater than greater than or equal to equals does not equal is_california_zipcode combo_value_list matches_year_quarter All those are self-explanatory, except the last two: - a 'combo_value_list' rule means that all the values in the listed fields in a given record must be an existing combination in the corresponding combo according to the cedars specification. - a 'matches_year_quarter' rule is used to ensure that a given ProgramCost's PrgYear matches its ProgramYearQuarter. Additionally, because every claim record has one corresponding record in either the CustomMeasure or DeemedMeasure table, in the table field in the spreadsheet we use the SQL syntax: coalesce (DeemedMeasure, CustomMeasure) for rules that require the claim table to be compared to either the CustomMeasure or DeemedMeasure table - whichever is populated for that claim. In SQL, coalesce means to take the first not null value from the options listed in the parenthesis. All rules define the expected state of the data; that is, when data do not comply with the requirement specified, the data are rejected. ------------------------------------------------------------------ Source of Truth File: warnings_rules.csv ------------------------------------------------------------------ The warnings_rules file provides are rules that, if violated, returns a warning to the user and the data are accepted. Data are tested against warnings on record-level basis. Some of these rules are conditionally applied to the data; conditional rules will have one or more set of where clause columns populated in the spreadsheet. Warnings have up to five where clauses. The rules used to specify warning conditions: is null or zero is not null less than less than or equal to greater than greater than or equal to equals does not equal For "coalesce (DeemedMeasure, CustomMeasure)", see above. All rules define the expected state of the data; that is, when data do not comply with the requirement specified by the rule, the data are accepted, but a warning is returned to the user. ------------------------------------------------------------------ Cedars cost equations ------------------------------------------------------------------ TotalGrossIncentive = (UnitEndUserRebate + UnitIncentiveToOthers + UnitDirectInstallLab + UnitDirectInstallMat) * NumUnits TotalGrossMeasureCost = UnitMeaCost1stBaseline * NumUnits TotalGrossMeasureCost_ER = UnitMeaCost2ndBaseline * NumUnits ------------------------------------------------------------------ Cedars savings equations First year, Lifecycle, Gross, Net ------------------------------------------------------------------ --------------------------------------- First year gross --------------------------------------- sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRkW else 1 end * Claim.NumUnits * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRkW else 1 end * Claim.NumUnits * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW2ndBaseline) as TotalFirstYearGrosskW, sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRkWh else 1 end * Claim.NumUnits * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRkWh else 1 end * Claim.NumUnits * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh2ndBaseline) as TotalFirstYearGrosskWh, sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRTherm else 1 end * Claim.NumUnits * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRTherm else 1 end * Claim.NumUnits * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm2ndBaseline) as TotalFirstYearGrossTherm, --------------------------------------- First year net --------------------------------------- sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * Claim.NumUnits * Claim.NTGRkW * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRkW * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW2ndBaseline) as TotalFirstYearNetkW, sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * Claim.NumUnits * Claim.NTGRkWh * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRkWh * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh2ndBaseline) as TotalFirstYearNetkWh, sum (case when Measure.RUL >= 1 then 1 when Measure.RUL > 0 and Measure.RUL < 1 then Measure.RUL when Measure.RUL = 0 and Measure.EUL > 1 then 1 else Measure.EUL end * Claim.NumUnits * Claim.NTGRTherm * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm1stBaseline + case when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL >= 1 then 1 - Measure.RUL when Measure.RUL > 0 and Measure.RUL < 1 and Measure.EUL < 1 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRTherm * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm2ndBaseline) as TotalFirstYearNetTherm, --------------------------------------- Lifecycle gross --------------------------------------- sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRkW else 1 end * Claim.NumUnits * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRkW else 1 end * Claim.NumUnits * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW2ndBaseline ) as TotalLifecycleGrosskW, sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRkWh else 1 end * Claim.NumUnits * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRkWh else 1 end * Claim.NumUnits * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh2ndBaseline ) as TotalLifecycleGrosskWh, sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRTherm else 1 end * Claim.NumUnits * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * case when Claim.DeliveryType = 'C&S' then Claim.NTGRTherm else 1 end * Claim.NumUnits * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm2ndBaseline ) as TotalLifecycleGrossTherm, --------------------------------------- Lifecycle net --------------------------------------- sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * Claim.NumUnits * Claim.NTGRkW * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRkW * Measure.InstallationRatekW * Measure.RealizationRatekW * Measure.UnitkW2ndBaseline ) as TotalLifecycleNetkW, sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * Claim.NumUnits * Claim.NTGRkWh * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRkWh * Measure.InstallationRatekWh * Measure.RealizationRatekWh * Measure.UnitkWh2ndBaseline ) as TotalLifecycleNetkWh, sum (case when Measure.RUL > 0 then Measure.RUL else Measure.EUL end * Claim.NumUnits * Claim.NTGRTherm * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm1stBaseline + case when Measure.RUL > 0 then Measure.EUL - Measure.RUL else 0 end * Claim.NumUnits * Claim.NTGRTherm * Measure.InstallationRateTherm * Measure.RealizationRateTherm * Measure.UnitTherm2ndBaseline ) as TotalLifecycleNetTherm,