r - How to select a range of years in uncleaned data in data.table?


Some of my data is in this format:

                     year  persons
1:                   2014       69
2:                   2013       76
3:     2013 couldn't come        3
4:                   2012       48
5:                   2011       57
6:                               1

As you can see, the data in the year column is not clean. When I want to select the rows for the years 2011 to 2014, the following code works:

df[year %in% c("2014", "2013", "2012", "2011") ] 

But when I try to select a range of years:

df[year >= 2011 , year <= 2014] # won't filter out row `2013 couldn't come`. 
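
A note on why this call does not filter as intended, sketched under the assumption that `year` is stored as character (which the mixed values suggest). In `df[i, j]` the comma separates the row filter `i` from the column expression `j`, so `year <= 2014` is evaluated as `j` rather than as a second row condition. Joining the two conditions with `&` is not enough on its own either, because comparing a character column with a number is done as a string comparison. The table below is rebuilt from the sample rows, and `year_num` is a helper name introduced only for illustration.

```r
library(data.table)

# Sample data rebuilt from the rows shown above; year is assumed to be character.
df <- data.table(
  year    = c("2014", "2013", "2013 couldn't come", "2012", "2011", ""),
  persons = c(69, 76, 3, 48, 57, 1)
)

# The comma makes `year <= 2014` the j expression, so the result is a
# logical vector computed on the rows kept by i, not a filtered table.
res_comma <- df[year >= 2011, year <= 2014]

# Both conditions in i, compared as numbers: text that is not a plain
# number becomes NA, and data.table drops rows where i is NA.
year_num  <- suppressWarnings(as.integer(df$year))
res_range <- df[year_num >= 2011 & year_num <= 2014]
```

Here `res_range` keeps only the rows whose year is a plain number within the range, while `res_comma` is not a data.table at all.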

If I want to select only regular years (getting rid of years with other text and empty years), I guess I can use a regular expression:

df[ year == '[0-9]{4}',]    # doesn't work. 

However, it doesn't work. How can I use a regular expression in data.table to:

  1. select range of year;
  2. filter out untidy years.
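
One possible way to do both steps, as a sketch rather than a definitive answer. `==` tests literal equality, so `year == '[0-9]{4}'` looks for a year that is literally the text `[0-9]{4}`. Base R's `grepl()` does the pattern matching and returns a logical vector that can go straight into `i`. The sample table is rebuilt from the rows in the question, and the column name `year_int` is an assumption made for illustration.

```r
library(data.table)

# Sample data rebuilt from the question; year is assumed to be character.
df <- data.table(
  year    = c("2014", "2013", "2013 couldn't come", "2012", "2011", ""),
  persons = c(69, 76, 3, 48, 57, 1)
)

# Step 2 first: keep only rows whose year is exactly four digits.
tidy <- df[grepl("^[0-9]{4}$", year)]

# Step 1: convert the surviving years to integer and select the range.
# between() is data.table's inclusive range test.
tidy[, year_int := as.integer(year)]
in_range <- tidy[between(year_int, 2011L, 2014L)]
```

Filtering first means `as.integer()` only ever sees clean four-digit strings, so no coercion warnings are raised.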

Or just use a single string operation if you want both #1 and #2 without cleaning the data:

df[grepl("^201[1-4]$", year)] 
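
To illustrate why the `^` and `$` anchors matter in this pattern, here is a small sketch on a table rebuilt from the sample rows in the question (named `df` here). Without the anchors the pattern can match anywhere in the string, so the `2013 couldn't come` row would be kept. data.table's `%like%` operator, a thin wrapper around `grepl()`, is shown as an equivalent spelling.

```r
library(data.table)

# Sample data rebuilt from the question; year is assumed to be character.
df <- data.table(
  year    = c("2014", "2013", "2013 couldn't come", "2012", "2011", ""),
  persons = c(69, 76, 3, 48, 57, 1)
)

# Anchored: the whole string must be one of 2011, 2012, 2013, 2014.
anchored <- df[grepl("^201[1-4]$", year)]

# Unanchored: a match anywhere in the string is enough, so the
# "2013 couldn't come" row is kept as well.
unanchored <- df[grepl("201[1-4]", year)]

# Same filter as `anchored`, written with data.table's %like% operator.
with_like <- df[year %like% "^201[1-4]$"]
```

The character class `[1-4]` only covers this particular range; a range that crosses a decade would need a different pattern or a numeric comparison.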
