Report all data, even if it fails tests?

Discussion of meteorological data.

Moderators: Bonnie.Jonkman, Andy.Clifton

Poll: Should we calculate all data and provide flags, even if we know results are likely to be rubbish?

yes, but provide flags that show quality: 2 votes (40%)
no, where there is less than 95% data just give NaN or similar: 3 votes (60%)

Total votes: 5

Andy.Clifton
Posts: 83
Joined: Wed Feb 29, 2012 3:13 pm
Organization: NREL
Location: Boulder, CO

Report all data, even if it fails tests?

Postby Andy.Clifton » Mon May 20, 2013 1:48 pm

A recent thread (https://wind.nrel.gov/forum/wind/viewtopic.php?f=31&t=822) raises an interesting question.

Even though I know the data might be wrong, should all calculations be done anyway?

Let me explain using an example: the sonic anemometers measure at 20 Hz. Sometimes those sonics can't acquire the full 12,000 valid samples per 10-minute interval. To deal with this, I have a quality threshold that defines the minimum fraction of valid samples required for me to "rotate" the sonics and calculate things like fluxes (for example, w'T'). Currently, if the fraction of valid data falls below 0.95, I don't bother to calculate anything requiring rotation of the sonics, as the results are likely junk.

Assuming the code still works on sparser data, I would instead run the calculations anyway, flag these data, and leave it up to users to apply their own filters to decide what is "good" or "bad" data.
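
For illustration, here is a minimal Python sketch of the two reporting policies in the poll. It is not the actual processing code: the function names are hypothetical, and a plain mean stands in for the rotation and flux calculation. The 20 Hz rate, the 10-minute interval, and the 0.95 threshold are taken from the post above.

```python
import math

SAMPLE_RATE_HZ = 20                              # sonic sampling rate (from the post)
INTERVAL_S = 600                                 # 10-minute interval
EXPECTED_SAMPLES = SAMPLE_RATE_HZ * INTERVAL_S   # 12,000 samples
MIN_VALID_FRACTION = 0.95                        # threshold quoted in the post


def valid_samples(samples):
    """Keep only samples that are present and not NaN."""
    return [s for s in samples if s is not None and not math.isnan(s)]


def summarize(samples, drop_below_threshold):
    """Return (value, passed) for one 10-minute interval.

    'value' is a plain mean, used here only as a stand-in for the
    rotated-sonic calculations. 'passed' plays the role of a quality flag.
    drop_below_threshold=True  -> policy "give NaN" when the test fails
    drop_below_threshold=False -> policy "report anyway, but flag"
    """
    valid = valid_samples(samples)
    passed = len(valid) / EXPECTED_SAMPLES >= MIN_VALID_FRACTION
    if not valid or (drop_below_threshold and not passed):
        return float("nan"), passed
    return sum(valid) / len(valid), passed
```

Under the "flag" policy a user can still reproduce the "NaN" policy by discarding every interval where the flag is not set, which is why the two options lose no information relative to each other.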
Andy Clifton, Ph.D.
Senior Engineer

Everett.Perry
Posts: 33
Joined: Tue Jan 29, 2013 9:53 am
Organization: Texas Tech University
Location: Oregon

Re: Report all data, even if it fails tests?

Postby Everett.Perry » Mon May 27, 2013 11:04 am

I am using the rotated data for my work. I am also using the 20Hz data extensively. For my purposes it is better to have data that does not meet the 95% rotation threshold listed as NaN or similar (which is how I voted in the poll).

However, I could scan through “quality” flags and determine the 95% threshold for myself. In other words, I can use the data either way with no loss of continuity. I just wanted to mention that in case it is a close poll.
Everett Perry
PhD Candidate, National Wind Institute
Texas Tech University

Jennifer.Rinker
Posts: 21
Joined: Tue Jun 25, 2013 11:34 am
Organization: Duke University
Location: NC, USA

Re: Report all data, even if it fails tests?

Postby Jennifer.Rinker » Fri Jul 12, 2013 11:38 am

I voted "no" simply because I'm worried that the change would be implemented without me noticing, and then I would be analyzing data that weren't very good. As Everett mentioned, if I knew about the change I could simply check the quality flag and ditch the bad data myself. Another solution would be to place the "bad" data in a separate directory from the "good" data, but I'm not sure that would be easy to implement with the current data structures.
Jenni Rinker, Ph.D.
Mechanical Engineering & Materials Science
Duke University/NWTC

Re: Report all data, even if it fails tests?

Postby Andy.Clifton » Thu Aug 15, 2013 10:47 am

Jenni, now that you've been using the data for a while and (hopefully) understand better how it works and what is in there, do you still think that you might not notice new data, or not use the QC flags to check if the data are valid? I've held off making this change, but I'm thinking it might be helpful still.
Andy Clifton, Ph.D.
Senior Engineer

Re: Report all data, even if it fails tests?

Postby Jennifer.Rinker » Fri Nov 01, 2013 9:48 am

Yep, I would be fine now with providing "bad" data but flagging it with a QC code. My only request would be to keep an up-to-date document of the meanings of the QC codes for each instrument, since they might be a bit fluid over time. Additionally, if it would be easy to put the QC codes in the 20 Hz structures in addition to the 10-minute structures, that would be a huge benefit to me.

Also, apologies for the huge delay in response. I assumed that if you replied to a thread you were auto-subscribed. :P
Jenni Rinker, Ph.D.
Mechanical Engineering & Materials Science
Duke University/NWTC

Re: Report all data, even if it fails tests?

Postby Andy.Clifton » Mon Nov 18, 2013 2:39 pm

Believe it or not, the QC codes are constant over time. The basic QC code hasn't changed for over a year, mainly because I figured that people would not want to deal with codes changing. The basic concept is set out in the unofficial guide (p. 15), and repeated here for interest. N.B. I've updated this file today to add the reason for code 1002.

– QC codes indicating that data are ‘flagged’ (possibly bad) are in the range 1000 to 4999. Reasons for flagging channels include:
* 1001 irregular timing. The period between measurements should be 0.05 seconds at a data acquisition rate of 20 Hz. If more than 1% of data are more than 5% from the ideal period, this QC code is set.
* 1002 insufficient data in the wind speed time series.
* 1003, 1004 if the number of points within the manufacturer’s limits or users’ limits is below a threshold set in the configuration file. These threshold values are the range rate (QC code 1003) and the accept rate (QC code 1004).
* 1006 if the standard deviation drops below 0.01% of the mean and so a channel is assumed to have a constant value during the measurement interval.
* 20nn if a channel is flagged because it is linked with another channel that has been flagged, where nn is the number of the channel that was flagged.
– QC codes indicating that channels or data have failed are greater than 5000. Reasons for marking channels as failed include:
* 5001 if a channel is empty.
* 5002 if all data in a channel have known ‘bad’ values, e.g. -999.
* 5003 if all data in a channel are not-a-number (NaN).
* 5004 if the boom speed exceeds 0.1 m/s at any time during the 10 minute interval.
* 5005 if the channel is affected by a known outage.
* 60nn if a channel fails because it is linked with another channel that has failed, where nn is the number of the channel that failed.
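
The code ranges and two of the checks listed above can be sketched in Python as follows. This is only an illustration of the rules as written in the guide, not the code that produces the data files. The function names are hypothetical. Treating nn as exactly two digits, using the population standard deviation, and comparing against the absolute value of the mean are all assumptions.

```python
import statistics


def classify_qc_code(code):
    """Map a QC code to 'flagged' or 'failed' using the ranges in the guide.

    Codes outside those ranges (including exactly 5000) are not described
    in the guide, so they are returned as 'other'.
    """
    if 1000 <= code <= 4999:
        return "flagged"
    if code > 5000:
        return "failed"
    return "other"


def linked_channel(code):
    """For 20nn and 60nn codes, return nn, the linked channel number.

    Assumes nn is always two digits. Returns None for any other code.
    """
    if 2000 <= code <= 2099 or 6000 <= code <= 6099:
        return code % 100
    return None


def irregular_timing(timestamps, rate_hz=20.0, max_dev=0.05, max_share=0.01):
    """Code 1001 rule: more than 1% of sample periods are more than 5%
    away from the ideal period (0.05 s at 20 Hz)."""
    ideal = 1.0 / rate_hz
    periods = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not periods:
        return False
    bad = sum(1 for p in periods if abs(p - ideal) > max_dev * ideal)
    return bad / len(periods) > max_share


def is_constant(values, rel_tol=1e-4):
    """Code 1006 rule: standard deviation below 0.01% (1e-4) of the mean.

    Note that a channel with a mean of exactly zero never passes this
    strict comparison; how the real code treats that case is not stated.
    """
    mean = statistics.fmean(values)
    return statistics.pstdev(values) < rel_tol * abs(mean)
```

A user filtering 10-minute records could, for example, keep a channel only when `classify_qc_code` returns neither 'flagged' nor 'failed' for any of its codes, or accept 'flagged' channels and reject only 'failed' ones.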


I will look into adding some of this to the 20-Hz data as well. Did you dig around in the structures to see what was already there?
Andy Clifton, Ph.D.
Senior Engineer

Re: Report all data, even if it fails tests?

Postby Jennifer.Rinker » Tue Nov 19, 2013 10:11 am

I've looked at the raw and cleaned/rotated sonic structures and at the tower structure in the 20 Hz .mat files from M4. The sonic structures have four fields (val, label, units, height) and the tower structure has several fields that are related to the sonics, but neither of them has a "flag" field or anything that I could recognize as QC codes. It's possible I'm looking in the wrong place, though.
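
For anyone who wants to repeat this check from Python rather than MATLAB, a minimal sketch follows. The helper names and the search hints ("flag", "qc") are hypothetical, and the file and variable names in the usage comment are placeholders, since the actual names are not given in this thread. The helpers accept either a dict or an object of the kind `scipy.io.loadmat` returns when called with `struct_as_record=False`.

```python
def field_names(struct):
    """List the field names of a MATLAB struct as loaded into Python."""
    if isinstance(struct, dict):
        return sorted(struct)
    # scipy's mat_struct objects list their fields in `_fieldnames`
    names = getattr(struct, "_fieldnames", None)
    if names is not None:
        return sorted(names)
    return sorted(vars(struct))


def find_qc_fields(struct, hints=("flag", "qc")):
    """Return field names containing any hint substring (case-insensitive)."""
    return [n for n in field_names(struct)
            if any(h in n.lower() for h in hints)]


# Possible usage (placeholder names, requires SciPy):
#   import scipy.io
#   data = scipy.io.loadmat("<20 Hz file>.mat",
#                           squeeze_me=True, struct_as_record=False)
#   print(find_qc_fields(data["<sonic structure>"]))
```

With the four sonic fields described above (val, label, units, height), `find_qc_fields` returns an empty list, which matches what was found by hand.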
Jenni Rinker, Ph.D.
Mechanical Engineering & Materials Science
Duke University/NWTC

Re: Report all data, even if it fails tests?

Postby Andy.Clifton » Tue Nov 19, 2013 10:21 am

It seems I didn't add information on the QC state of each piece of 20-Hz data. I'll look into it, but it will double the file size and will require that I rewrite almost all of the codes. Not a high priority, I'm afraid.
Andy Clifton, Ph.D.
Senior Engineer

