diff --git a/NEWS.md b/NEWS.md
index 610f678e31..3ca44e1dc8 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -10,6 +10,10 @@
 
 3. The number of logical CPUs used by default has been reduced from 100% to 50%. The previous 100% default was reported to cause significant slow downs when other non-trivial processes were also running: [#3395](https://github.com/Rdatatable/data.table/issues/3395), [#3298](https://github.com/Rdatatable/data.table/issues/3298). Two new optional environment variables (`R_DATATABLE_NUM_PROCS_PERCENT` & `R_DATATABLE_NUM_THREADS`) control this default. \code(setDTthreads()) gains \code{percent=} and \code{?setDTthreads} has been significantly revised. \code{getDTthreads(verbose=TRUE)} has been expanded. The environment variable `OMP_THREAD_LIMIT` is now respected ([#3300](https://github.com/Rdatatable/data.table/issues/3300)) in addition to `OMP_NUM_THREADS` as before.
 
+4. `rbind` and `rbindlist` now retain the position of duplicate column names rather than grouping them together [#3373](https://github.com/Rdatatable/data.table/issues/3373), fill length-0 columns (including `NULL`) with `NA` with a warning [#1871](https://github.com/Rdatatable/data.table/issues/1871), and recycle length-1 columns [#524](https://github.com/Rdatatable/data.table/issues/524). Thanks to Kun Ren for the requests, which arose when parsing JSON.
+
+5. `rbindlist`'s `use.names=` default has changed from `FALSE` to `"check"`. This warns if the column names of each item are not identical and then proceeds as if `use.names=FALSE` for backwards compatibility; i.e., it binds by column number, not by column name. In a future release, it will warn and then proceed as if `use.names=TRUE`. Eventually the default will be changed from `"check"` to `TRUE` unless user feedback is negative. The `rbind` method for `data.table` already sets `use.names=TRUE`, as does `rbind` for `data.frame` in base; binding by name is clearly safer. To stack differently named columns together silently (the previous default behavior), it is now necessary to write `use.names=FALSE` for clarity to readers of your code. Thanks to Clayton Stanley who first raised the issue [here](http://lists.r-forge.r-project.org/pipermail/datatable-help/2014-April/002480.html).
+
 #### BUG FIXES
 
 1. `rbindlist()` of a malformed factor missing levels attribute is now a helpful error rather than a cryptic error about `STRING_ELT`, [#3315](https://github.com/Rdatatable/data.table/issues/3315). Thanks to Michael Chirico for reporting.
@@ -34,6 +38,14 @@
 
 11. A join's result could be incorrectly keyed when a single nomatch occurred at the very beginning while all other values matched, [#3441](https://github.com/Rdatatable/data.table/issues/3441). The incorrect key would cause incorrect results in subsequent queries. Thanks to @symbalex for reporting and @franknarf1 for pinpointing the root cause.
 
+12. `rbind` and `rbindlist(..., use.names=TRUE)` with over 255 columns could return the columns in a random order, [#3373](https://github.com/Rdatatable/data.table/issues/3373). The contents and name of each column were correct, but the order in which the columns appeared in the result might not match the original input.
+
+13. `rbind` and `rbindlist` now combine `integer64` columns together with non-`integer64` columns correctly [#1349](https://github.com/Rdatatable/data.table/issues/1349), and support `raw` columns [#2819](https://github.com/Rdatatable/data.table/issues/2819).
+
+14. `NULL` columns are now caught and raise an appropriate error rather than causing a segfault in some cases, [#2303](https://github.com/Rdatatable/data.table/issues/2303) [#2305](https://github.com/Rdatatable/data.table/issues/2305). Thanks to Hugh Parsonage and @franknarf1 for reporting.
+
+15. `melt` would error with 'factor malformed' or segfault in the presence of duplicate column names, [#1754](https://github.com/Rdatatable/data.table/issues/1754). Many thanks to @franknarf1, William Marble, wligtenberg and Toby Dylan Hocking for reproducible examples. All examples have been added to the test suite.
+
 #### NOTES
 
 1. When upgrading to 1.12.0 some Windows users might have seen `CdllVersion not found` in some circumstances. We found a way to catch that so the [helpful message](https://twitter.com/MattDowle/status/1084528873549705217) now occurs for those upgrading from versions prior to 1.12.0 too, as well as those upgrading from 1.12.0 to a later version. See item 1 in notes section of 1.12.0 below for more background.
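
Reviewer note on NEWS item 5 above: a minimal sketch of the difference between binding by name and binding by position, assuming a released `data.table` is installed. The tables `a` and `b` and the result names are hypothetical, chosen only for illustration. Both calls pass `use.names=` explicitly, so neither relies on the new `"check"` default.

```r
library(data.table)

# Two items with the same column names in a different order (illustrative data)
a <- data.table(x = 1L, y = 2L)
b <- data.table(y = 3L, x = 4L)

# Bind by column name: b's "x" lands under a's "x"
by_name <- rbindlist(list(a, b), use.names = TRUE)

# Bind by column position, silently: b's first column lands under a's "x"
by_position <- rbindlist(list(a, b), use.names = FALSE)
```

With these inputs, `by_name$x` holds 1 and 4, while `by_position$x` holds 1 and 3, which is the silent mismatch the new default is meant to surface.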
diff --git a/R/as.data.table.R b/R/as.data.table.R
index e11732067b..a03ba35bb5 100644
--- a/R/as.data.table.R
+++ b/R/as.data.table.R
@@ -108,6 +108,8 @@ as.data.table.array <- function(x, keep.rownames=FALSE, sorted=TRUE, value.name=
 }
 
 as.data.table.list <- function(x, keep.rownames=FALSE, ...) {
+  wn = vapply(x, is.null, logical(1L))
+  if (any(wn)) x = x[!wn]
   if (!length(x)) return( null.data.table() )
   # fix for #833, as.data.table.list with matrix/data.frame/data.table as a list element..
   # TODO: move this entire logic (along with data.table() to C
@@ -125,18 +127,17 @@ as.data.table.list <- function(x, keep.rownames=FALSE, ...) {
   idx = which(n < mn)
   if (length(idx)) {
     for (i in idx) {
-      if (!is.null(x[[i]])) {# avoids warning when a list element is NULL
-        if (inherits(x[[i]], "POSIXlt")) {
-          warning("POSIXlt column type detected and converted to POSIXct. We do not recommend use of POSIXlt at all because it uses 40 bytes to store one date.")
-          x[[i]] = as.POSIXct(x[[i]])
-        }
-        # Implementing FR #4813 - recycle with warning when nr %% nrows[i] != 0L
-        if (!n[i] && mn)
-          warning("Item ", i, " is of size 0 but maximum size is ", mn, ", therefore recycled with 'NA'")
-        else if (n[i] && mn %% n[i] != 0L)
-          warning("Item ", i, " is of size ", n[i], " but maximum size is ", mn, " (recycled leaving a remainder of ", mn%%n[i], " items)")
-        x[[i]] = rep(x[[i]], length.out=mn)
+      # NULL items were already removed above, so no is.null(x[[i]]) guard is needed here (it used to avoid a warning when a list element was NULL)
+      if (inherits(x[[i]], "POSIXlt")) {
+        warning("POSIXlt column type detected and converted to POSIXct. We do not recommend use of POSIXlt at all because it uses 40 bytes to store one date.")
+        x[[i]] = as.POSIXct(x[[i]])
       }
+      # Implementing FR #4813 - recycle with warning when nr %% nrows[i] != 0L
+      if (!n[i] && mn)
+        warning("Item ", i, " is of size 0 but maximum size is ", mn, ", therefore recycled with 'NA'")
+      else if (n[i] && mn %% n[i] != 0L)
+        warning("Item ", i, " is of size ", n[i], " but maximum size is ", mn, " (recycled leaving a remainder of ", mn%%n[i], " items)")
+      x[[i]] = rep(x[[i]], length.out=mn)
     }
   }
   # fix for #842
diff --git a/R/data.table.R b/R/data.table.R
index 2052d1c921..9081ee0594 100644
--- a/R/data.table.R
+++ b/R/data.table.R
@@ -215,17 +215,6 @@ replace_dot_alias <- function(e) {
   }
 }
 
-# A (relatively) fast (uses DT grouping) wrapper for matching two vectors, BUT:
-# it behaves like 'pmatch' but only the 'exact' matching part. That is, a value in
-# 'x' is matched to 'table' only once. No index will be present more than once.
-# This should make it even clearer:
-# chmatch2(c("a", "a"), c("a", "a")) # 1,2 - the second 'a' in 'x' has a 2nd match in 'table'
-# chmatch2(c("a", "a"), c("a", "b")) # 1,NA - the second one doesn't 'see' the first 'a'
-# chmatch2(c("a", "a"), c("a", "a.1")) # 1,NA - this is where it differs from pmatch - we don't need the partial match.
-chmatch2 <- function(x, table, nomatch=NA_integer_) {
-  .Call(Cchmatch2, x, table, as.integer(nomatch)) # this is in 'rbindlist.c' for now.
-}
-
 "[.data.table" <- function (x, i, j, by, keyby, with=TRUE, nomatch=getOption("datatable.nomatch"), mult="all", roll=FALSE, rollends=if (roll=="nearest") c(TRUE,TRUE) else if (roll>=0) c(FALSE,TRUE) else c(TRUE,FALSE), which=FALSE, .SDcols, verbose=getOption("datatable.verbose"), allow.cartesian=getOption("datatable.allow.cartesian"), drop=NULL, on=NULL)
 {
   # ..selfcount <<- ..selfcount+1  # in dev, we check no self calls, each of which doubles overhead, or could
@@ -369,14 +358,7 @@ chmatch2 <- function(x, table, nomatch=NA_integer_) {
     }
   }
 
-  # To take care of duplicate column names properly (see chmatch2 function above `[data.table`) for description
-  dupmatch <- function(x, y, ...) {
-    if (anyDuplicated(x))
-      pmax(chmatch(x,y, ...), chmatch2(x,y,0L))
-    else chmatch(x,y)
-  }
-
-  # setdiff removes duplicate entries, which'll create issues with duplicated names. Use '%chin% instead.
+  # setdiff removes duplicate entries, which'll create issues with duplicated names. Use %chin% instead.
   dupdiff <- function(x, y) x[!x %chin% y]
 
   if (!missing(i)) {
@@ -739,7 +721,7 @@ chmatch2 <- function(x, table, nomatch=NA_integer_) {
       if (length(tt)) jisvars[tt] = paste0("i.",jisvars[tt])
       if (length(duprightcols <- rightcols[duplicated(rightcols)])) {
         nx = c(names(x), names(x)[duprightcols])
-        rightcols = chmatch2(names(x)[rightcols], nx)
+        rightcols = chmatchdup(names(x)[rightcols], nx)
         nx = make.unique(nx)
       } else nx = names(x)
       ansvars = make.unique(c(nx, jisvars))
@@ -790,20 +772,16 @@ chmatch2 <- function(x, table, nomatch=NA_integer_) {
       if (is.character(j)) {
         if (notj) {
           w = chmatch(j, names(x))
-          if (anyNA(w)) {
-            warning("column(s) not removed because not found: ",paste(j[is.na(w)],collapse=","))
-            w = w[!is.na(w)]
-          }
-          # changed names(x)[-w] to use 'setdiff'. Here, all instances of the column must be removed.
-          # Ex: DT <- data.table(x=1, y=2, x=3); DT[, !"x", with=FALSE] should just output 'y'.
-          # But keep 'dup cols' beause it's basically DT[, !names(DT) %chin% "x", with=FALSE] which'll subset all cols not 'x'.
-          ansvars = if (length(w)) dupdiff(names(x), names(x)[w]) else names(x)
-          ansvals = dupmatch(ansvars, names(x))
+          if (anyNA(w)) warning("column(s) not removed because not found: ",paste(j[is.na(w)],collapse=","))
+          # all duplicates of the name in names(x) must be removed; e.g. data.table(x=1, y=2, x=3)[, !"x"] should just output 'y'.
+          w = !names(x) %chin% j
+          ansvars = names(x)[w]
+          ansvals = which(w)
         } else {
-          # once again, use 'setdiff'. Basically, unless indices are specified in `j`, we shouldn't care about duplicated columns.
-          ansvars = j   # x. and i. prefixes may be in here, and they'll be dealt with below
-          # dups = FALSE here.. even if DT[, c("x", "x"), with=FALSE], we subset only the first.. No way to tell which one the OP wants without index.
-          ansvals = chmatch(ansvars, names(x))
+          # if DT[, c("x","x")] and "x" is duplicated in names(DT), we still subset only the first. Because dups are unusual and
+          # it's more common to select the same column a few times. A syntax would be needed to distinguish these intents.
+          ansvars = j   # x. and i. prefixes may be in here, they'll result in NA and will be dealt with further below if length(leftcols)
+          ansvals = chmatch(ansvars, names(x))   # not chmatchdup()
         }
         if (!length(ansvals)) return(null.data.table())
         if (!length(leftcols)) {
@@ -1019,7 +997,7 @@ chmatch2 <- function(x, table, nomatch=NA_integer_) {
           # over a subset of columns
 
           # all duplicate columns must be matched, because nothing is provided
-          ansvals = dupmatch(ansvars, names(x))
+          ansvals = chmatchdup(ansvars, names(x))
         } else {
           # FR #4979 - negative numeric and character indices for SDcols
           colsub = substitute(.SDcols)
@@ -1432,6 +1410,7 @@ chmatch2 <- function(x, table, nomatch=NA_integer_) {
       setattr(jval, 'class', class(x)) # fix for #5296
       if (haskey(x) && all(key(x) %chin% names(jval)) && suppressWarnings(is.sorted(jval, by=key(x))))  # TO DO: perhaps this usage of is.sorted should be allowed internally then (tidy up and make efficient)
         setattr(jval, 'sorted', key(x))
+      for (i in seq_along(jval)) if (is.null(jval[[i]])) stop("Column ",i," of j evaluates to NULL. A NULL column is invalid.")
     }
     return(jval)
   }
@@ -2432,10 +2411,6 @@ copy <- function(x) {
   alloc.col(newx)
 }
 
-copyattr <- function(from, to) {
-  .Call(Ccopyattr, from, to)
-}
-
 point <- function(to, to_idx, from, from_idx) {
   .Call(CpointWrapper, to, to_idx, from, from_idx)
 }
@@ -2652,13 +2627,19 @@ set <- function(x,i=NULL,j,value)  # low overhead, loopable
   invisible(x)
 }
 
-chmatch <- function(x,table,nomatch=NA_integer_)
-  .Call(Cchmatchwrapper,x,table,as.integer(nomatch[1L]),FALSE) # [1L] to fix #1672
+chmatch <- function(x, table, nomatch=NA_integer_)
+  .Call(Cchmatch, x, table, as.integer(nomatch[1L])) # [1L] to fix #1672
 
-"%chin%" <- function(x,table) {
-  # TO DO  if table has 'ul' then match to that
-  .Call(Cchmatchwrapper,x,table,NA_integer_,TRUE)
-}
+# chmatchdup() behaves like 'pmatch' but only the 'exact' matching part; i.e. a value in
+# 'x' is matched to 'table' only once. No index will be present more than once. For example:
+# chmatchdup(c("a", "a"), c("a", "a")) # 1,2 - the second 'a' in 'x' has a 2nd match in 'table'
+# chmatchdup(c("a", "a"), c("a", "b")) # 1,NA - the second one doesn't 'see' the first 'a'
+# chmatchdup(c("a", "a"), c("a", "a.1")) # 1,NA - this is where it differs from pmatch - we don't need the partial match.
+chmatchdup <- function(x, table, nomatch=NA_integer_)
+  .Call(Cchmatchdup, x, table, as.integer(nomatch[1L]))
+
+"%chin%" <- function(x, table)
+  .Call(Cchin, x, table)  # TO DO  if table has 'ul' then match to that
 
 chorder <- function(x) {
   o = forderv(x, sort=TRUE, retGrp=FALSE)
@@ -2671,7 +2652,6 @@ chgroup <- function(x) {
   if (length(o)) as.vector(o) else seq_along(x)  # as.vector removes the attributes
 }
 
-
 .rbind.data.table <- function(..., use.names=TRUE, fill=FALSE, idcol=NULL) {
   # See FAQ 2.23
   # Called from base::rbind.data.frame
@@ -2681,14 +2661,20 @@ chgroup <- function(x) {
   rbindlist(l, use.names, fill, idcol)
 }
 
-rbindlist <- function(l, use.names=fill, fill=FALSE, idcol=NULL) {
+rbindlist <- function(l, use.names="check", fill=FALSE, idcol=NULL) {
   if (isFALSE(idcol)) { idcol = NULL }
   else if (!is.null(idcol)) {
     if (isTRUE(idcol)) idcol = ".id"
     if (!is.character(idcol)) stop("idcol must be a logical or character vector of length 1. If logical TRUE the id column will named '.id'.")
     idcol = idcol[1L]
   }
-  # fix for #1467, quotes result in "not resolved in current namespace" error
+  miss = missing(use.names)
+  # more checking of use.names happens at C level; this is just minimal to massage 'check' to NA
+  if (identical(use.names, NA)) stop("use.names=NA invalid")  # otherwise use.names=NA could creep in as a usage equivalent to use.names='check'
+  if (identical(use.names,"check")) {
+    if (!miss) stop("use.names='check' cannot be used explicitly because the value 'check' is new in v1.12.2 and subject to change. It is just meant to convey default behavior. See ?rbindlist.")
+    use.names = NA
+  }
   ans = .Call(Crbindlist, l, use.names, fill, idcol)
   if (!length(ans)) return(null.data.table())
   setDT(ans)[]
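
Reviewer note on the `notj` branch changed in `[.data.table` above: a small sketch of the behavior its comment describes, assuming a released `data.table` is installed. `DT` and `res` are hypothetical names used only for illustration.

```r
library(data.table)

# "x" appears twice in the column names (data.table() keeps duplicate names by default)
DT <- data.table(x = 1, y = 2, x = 3)

# Negated character j: every column named "x" is dropped, not just the first match
res <- DT[, !"x"]
```

Per the comment in the hunk, `names(res)` should be just `"y"`, because the selection is computed as `!names(x) %chin% j` rather than through a first-match lookup.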
diff --git a/inst/tests/melt_1754.R.gz b/inst/tests/melt_1754.R.gz
new file mode 100644
index 0000000000..f9b56ded80
Binary files /dev/null and b/inst/tests/melt_1754.R.gz differ
diff --git a/inst/tests/melt_1754_synth.csv b/inst/tests/melt_1754_synth.csv
new file mode 100644
index 0000000000..23c4cc3227
--- /dev/null
+++ b/inst/tests/melt_1754_synth.csv
@@ -0,0 +1,40 @@
+state,income,retailprice,percent_15_19,beercons,smoking88,smoking80,smoking75,smoking70,smoking71,smoking72,smoking73,smoking74,smoking75,smoking76,smoking77,smoking78,smoking79,smoking80,smoking81,smoking82,smoking83,smoking84,smoking85,smoking86,smoking87,smoking88,smoking89,smoking90,smoking91,smoking92,smoking93,smoking94,smoking95,smoking96,smoking97,smoking98,smoking99,smoking00
+1,9.678973622,89.34444512,0.174801901,18.95999985,112.0999985,123.1999969,111.6999969,89.80000305,95.40000153,101.0999985,102.9000015,108.1999969,111.6999969,116.1999969,117.0999985,123,121.4000015,123.1999969,119.5999985,119.0999985,116.3000031,113,114.5,116.3000031,114,112.0999985,105.5999985,108.5999985,107.9000015,109.0999985,108.5,107.0999985,102.5999985,101.4000015,104.9000015,106.1999969,100.6999969,96.19999695
+2,9.643623246,89.8777771,0.164611373,18.52000008,121.5,131.8000031,114.8000031,100.3000031,104.0999985,103.9000015,108,109.6999969,114.8000031,119.0999985,122.5999985,127.3000031,126.5,131.8000031,128.6999969,127.4000015,128,123.0999985,125.8000031,126,122.3000031,121.5,118.3000031,113.0999985,116.8000031,126,113.8000031,108.8000031,113,110.6999969,108.6999969,109.5,104.8000031,99.40000153
+4,9.984357198,82.62222205,0.173703247,25.08000031,94.59999847,131,131,124.8000031,125.5,134.3000031,137.8999939,132.8000031,131,134.1999969,132,129.1999969,131.5,131,133.8000031,130.5,125.3000031,119.6999969,112.4000015,109.9000015,102.4000015,94.59999847,88.80000305,87.40000153,90.19999695,88.30000305,88.59999847,89.09999847,85.40000153,83.09999847,81.30000305,81.19999695,79.59999847,73
+5,10.18803512,103.4777764,0.163659688,20.7,104.8000031,118,110.1999969,120,117.5999985,110.8000031,109.3000031,112.4000015,110.1999969,113.4000015,117.3000031,117.5,117.4000015,118,116.4000015,114.6999969,114.0999985,112.5,111,108.5,109,104.8000031,100.5999985,91.5,86.69999695,83.5,79.09999847,76.59999847,79.30000305,76,75.90000153,75.5,73.40000153,71.40000153
+6,9.974561161,90.05555513,0.178224497,26.08000031,137.1000061,150.5,147.6000061,155,161.1000061,156.3000031,154.6999969,151.3000031,147.6000061,153,153.3000031,155.5,150.1999969,150.5,152.6000061,154.1000061,149.6000061,144,144.5,142.3999939,141,137.1000061,131.6999969,127.1999969,118.8000031,120,123.8000031,126.0999985,127.1999969,128.3000031,124.0999985,132.8000031,139.5,140.6999969
+7,9.817172262,84.36666658,0.176944127,21.75999985,124.0999985,134,122.9000015,109.9000015,115.6999969,117,119.8000031,123.6999969,122.9000015,125.9000015,127.9000015,130.6000061,131,134,131.6999969,131.1999969,128.6000061,126.3000031,128.8000031,129,129.3000031,124.0999985,117.0999985,113.8000031,109.5999985,109.1999969,109.1999969,107.8000031,100.3000031,102.6999969,100.5999985,100.5,97.09999847,88.40000153
+8,9.711300956,86.07777786,0.152016721,22.22000008,84.5,115.1999969,123.3000031,102.4000015,108.5,126.0999985,121.8000031,125.5999985,123.3000031,125.0999985,125,122.8000031,117.5,115.1999969,114.0999985,111.5,111.3000031,103.5999985,100.6999969,96.69999695,95,84.5,78.40000153,90.09999847,85.40000153,85.09999847,86.69999695,93,78.19999695,73.59999847,75,78.90000153,75.09999847,66.90000153
+9,10.00688288,89.83333249,0.17028118,24.74000015,107.5999985,135.1999969,131.8000031,124.8000031,125.5999985,126.5999985,124.4000015,131.8999939,131.8000031,134.3999939,134,136.6999969,135.3000031,135.1999969,133,130.6999969,127.9000015,124,121.5999985,118.1999969,109.5,107.5999985,104.5999985,94.09999847,96.09999847,94.80000305,94.59999847,85.69999695,84.30000305,81.80000305,79.59999847,80.30000305,72.19999695,70
+10,9.831646389,81.08888838,0.175089656,21.97999992,134,146.8999939,162.3999939,134.6000061,139.3000031,149.1999969,156,159.6000061,162.3999939,166.6000061,173,150.8999939,148.8999939,146.8999939,148.5,147.6999969,143,137.8000031,135.3000031,137.6000061,134,134,132.5,128.3000031,127.1999969,128.1999969,126.8000031,128.1999969,135.3999939,135.1000061,135.3000031,135.8999939,133.3000031,125.5
+11,9.836926672,90.65555615,0.169908937,23.23999977,100.1999969,124.5999985,120.5,108.5,108.4000015,109.4000015,110.5999985,116.0999985,120.5,124.4000015,125.5,127.0999985,124.1999969,124.5999985,132.8999939,116.1999969,115.5999985,111.1999969,109.4000015,104.0999985,101.0999985,100.1999969,94.40000153,95.40000153,97.09999847,95.19999695,92.5,93.40000153,93,94,93.90000153,94,91.69999695,88.90000153
+12,9.916982969,87.78888872,0.170582652,19.94000015,103.1999969,127.0999985,123.4000015,114,102.8000031,111,115.1999969,118.5999985,123.4000015,127.6999969,127.9000015,127.0999985,126.4000015,127.0999985,132,130.8999939,127.5999985,121.6999969,115.6999969,109.4000015,105.1999969,103.1999969,96.5,94.30000305,91.80000305,90,89.90000153,89.09999847,90.09999847,88.69999695,89.19999695,87.59999847,83.30000305,79.80000305
+13,9.695040385,71.48888991,0.17540008,18.94000015,173.1999969,215.3000031,223,155.8000031,163.5,179.3999939,201.8999939,212.3999939,223,230.8999939,229.3999939,224.6999969,214.8999939,215.3000031,209.6999969,210.6000061,201.1000061,183.1999969,182.3999939,179.8000031,171.1999969,173.1999969,171.6000061,182.5,170.3999939,167.6000061,167.6000061,170.1000061,175.3000031,179,186.8000031,171.3000031,165.3000031,156.1999969
+14,9.747586568,90.0666665,0.181948922,23.87999992,110.9000015,143.8000031,133.6000061,115.9000015,119.8000031,125.3000031,126.6999969,129.8999939,133.6000061,139.6000061,140,142.6999969,140.1000061,143.8000031,144,143.8999939,133.6999969,128.8999939,125,121.1999969,116.5,110.9000015,103.5999985,101.5,107.1999969,108.5,106.1999969,105.3000031,105.6999969,106.8000031,105.3000031,103.1999969,101,104.3000031
+15,9.786902746,91.89999898,0.16571967,22.44000015,125,141.1999969,140.6999969,128.5,133.1999969,136.5,138,142.1000061,140.6999969,144.8999939,145.6000061,143.8999939,138.5,141.1999969,138.8999939,139.5,135.3999939,135.5,127.9000015,119,125,125,122.4000015,117.5,116.0999985,114.5,108.5,101.5999985,102.3000031,100,101.0999985,94.5,85.5,82.90000153
+16,9.938785023,96.28888872,0.172464155,23.12000008,94.09999847,117.6999969,111.5,104.3000031,116.4000015,96.80000305,106.8000031,110.5999985,111.5,116.6999969,117.1999969,118.9000015,118.3000031,117.6999969,120.8000031,119.4000015,113.1999969,110.8000031,113,104.3000031,108.8000031,94.09999847,92.30000305,90.69999695,86.19999695,83.80000305,81.59999847,83.40000153,84.09999847,81.69999695,84.09999847,83.19999695,80.69999695,76
+17,9.546848297,88.92222214,0.181491156,21.14000015,109,127,116.8000031,93.40000153,105.4000015,112.0999985,115,117.0999985,116.8000031,120.9000015,122.0999985,124.9000015,123.9000015,127,125.3000031,125.8000031,122.3000031,116.4000015,115.3000031,113.1999969,110,109,108.3000031,101.8000031,105.5999985,103.9000015,105.4000015,106,107.5,106.9000015,106.3000031,107,103.9000015,97.19999695
+18,9.877391073,84.46666675,0.166553891,23.88000069,127.4000015,142.1000061,135.6000061,121.3000031,127.5999985,130,132.1000061,135.3999939,135.6000061,139.5,140.8000031,141.8000031,140.1999969,142.1000061,140.5,139.6999969,134.1000061,130,129.1999969,128.8000031,128.6999969,127.4000015,122.8000031,119.0999985,119.9000015,122.3000031,121.5999985,119.4000015,124,124.0999985,120.5999985,120.0999985,118,113.8000031
+19,9.753027174,85.5888888,0.16537521,27.87999992,87.09999847,122,123.6999969,111.1999969,115.5999985,122.1999969,119.9000015,121.9000015,123.6999969,124.9000015,127,127.1999969,120.3000031,122,121.0999985,122.4000015,113.6999969,110.0999985,103.5999985,97.80000305,91.69999695,87.09999847,86.19999695,84.69999695,82.90000153,86.59999847,86,88.19999695,90.5,87.30000305,88.90000153,89.09999847,82.59999847,75.5
+20,9.850039482,89.48888906,0.168586676,24.71999969,92.90000153,116.3000031,114.0999985,108.0999985,108.5999985,104.9000015,106.5999985,110.5,114.0999985,118.0999985,117.6999969,117.4000015,116.0999985,116.3000031,117,117.0999985,110.8000031,107.6999969,105.0999985,103.0999985,101.3000031,92.90000153,93.80000305,89.90000153,92.40000153,90.59999847,91.09999847,85.90000153,88.5,86.19999695,85.5,83.09999847,86.59999847,77.59999847
+21,10.02443239,93.24444538,0.162917763,37,141.8999939,177.6999969,205.1999969,189.5,190.5,198.6000061,201.5,204.6999969,205.1999969,201.3999939,190.8000031,187,183.3000031,177.6999969,171.8999939,165.1000061,159.1999969,136.6000061,146.6999969,142.6000061,147.6999969,141.8999939,137.8999939,137.3000031,115.5,110,108.0999985,105.1999969,100.9000015,99,95.59999847,102.4000015,103.9000015,93.19999695
+22,10.00636715,83.39999941,0.169238773,34.95999985,180.3999939,247.8000031,269.1000061,265.7000122,278,296.2000122,279,269.7999878,269.1000061,290.5,278.7999878,269.6000061,254.6000061,247.8000031,245.3999939,239.8000031,232.8999939,215.1000061,201.1000061,195.8999939,195.1000061,180.3999939,172.8999939,152.3999939,144.8000031,143.6999969,148.8999939,153.8000031,158.5,158,174.3999939,173.8000031,171.6999969,147.3000031
+23,9.708400938,87.48888736,0.174311015,27.97999992,77.69999695,102.6999969,103.0999985,90,92.59999847,99.30000305,98.90000153,100.3000031,103.0999985,102.4000015,102.4000015,103.0999985,101,102.6999969,103,97.5,96.30000305,88.90000153,88,88.19999695,82.30000305,77.69999695,74.40000153,70.80000305,69.90000153,71.40000153,69,68.19999695,67,65.69999695,61.79999924,62.59999847,59.70000076,53.79999924
+24,9.751609802,71.44444402,0.179371097,19.92000008,146,187.8000031,226,172.3999939,187.6000061,214.1000061,226.5,227.3000031,226,230.1999969,217,205.5,197.3000031,187.8000031,179.3000031,179,169.8000031,160.6000061,156.3000031,154.3999939,150.5,146,139.3000031,133.6999969,132.6999969,128.8999939,129.6999969,112.6999969,124.9000015,129.6999969,125.5999985,126,113.0999985,109
+25,9.756118351,88.90000068,0.180801443,23.5,87.09999847,123.6999969,117.9000015,93.80000305,98.5,103.8000031,108.6999969,110.5,117.9000015,125.4000015,122.1999969,121.9000015,121.3000031,123.6999969,125.6999969,126.8000031,119.5999985,109.4000015,103.1999969,99.80000305,92.30000305,87.09999847,84.09999847,77.09999847,85.19999695,74.30000305,83,81,80.59999847,80.80000305,77.5,79.09999847,74.69999695,72.5
+26,9.886301253,84.55555513,0.169848301,24.02000008,122.4000015,133.5,122.5,121.5999985,124.5999985,124.4000015,120.5,122.0999985,122.5,124.5999985,127.3000031,131.3000031,130.8999939,133.5,132.8000031,134,130,127.0999985,126.6999969,126.3000031,124.5999985,122.4000015,118.5999985,115.5,113.1999969,112.3000031,108.9000015,108.5999985,111.6999969,107.5999985,108.5999985,106.4000015,104,99.90000153
+27,9.814118067,90.47777812,0.168725929,18.13999977,103.5999985,141.6000061,132.8999939,108.4000015,115.4000015,121.6999969,124.0999985,130.5,132.8999939,138.6000061,140.3999939,143.6000061,141.6000061,141.6000061,143.6999969,147,140,128.1000061,124.1999969,119.9000015,113.0999985,103.5999985,97.5,88.40000153,87.80000305,86.30000305,86.19999695,104.8000031,109.5,110.8000031,111.8000031,112.1999969,111.4000015,108.9000015
+28,9.926845233,89.17777803,0.164436671,25.07999992,107.5999985,124,114.5999985,107.3000031,106.3000031,109,110.6999969,114.1999969,114.5999985,118.8000031,120.0999985,122.3000031,122.5999985,124,125.1999969,123.3000031,125.3000031,115.3000031,115.8000031,113.9000015,110.5999985,107.5999985,107.0999985,101.3000031,102.5,96.19999695,94.69999695,95.40000153,95.40000153,93.30000305,92.90000153,92.09999847,91.09999847,87.90000153
+29,9.931006961,90.22222307,0.175420049,25.54000015,138,149.3000031,154.6999969,123.9000015,123.1999969,134.3999939,142,146.1000061,154.6999969,150.1999969,148.8000031,146.8000031,145.8000031,149.3000031,151.1999969,146.3000031,135.8000031,136.8999939,133.3999939,136.3000031,124.4000015,138,120.8000031,101.4000015,103.5999985,100.0999985,94.09999847,91.90000153,90.80000305,87.5,90,88.69999695,86.90000153,83.09999847
+30,9.673460537,76.61111196,0.184413918,22.9,124.4000015,138.3000031,130.5,103.5999985,115,118.6999969,125.5,129.6999969,130.5,136.8000031,137.1999969,140.3999939,135.6999969,138.3000031,136.1000061,136,131.1000061,127,125.4000015,126.5999985,126.5999985,124.4000015,122.4000015,118.5999985,121.5,112.8000031,115.1999969,112.1999969,109.1999969,102.9000015,124.5,126.9000015,109.4000015,103.9000015
+31,9.702802976,88.54444461,0.173589869,21.26000023,91.90000153,114.6999969,113.5,92.69999695,96.69999695,103,103.5,108.4000015,113.5,116.6999969,115.5999985,116.9000015,117.4000015,114.6999969,115.6999969,113,109.8000031,105.6999969,104.4000015,97,95.80000305,91.90000153,87.40000153,88.30000305,91.80000305,93,91.59999847,94.80000305,98.59999847,92.30000305,88.80000305,88.30000305,83.5,75.09999847
+32,9.737283919,85.17777846,0.171025995,20.57999992,125.3000031,130.3999939,117.4000015,99.80000305,106.3000031,111.5,109.6999969,114.8000031,117.4000015,121.6999969,124.5999985,127.3000031,127.1999969,130.3999939,129.1000061,131.3999939,129,125.0999985,128.6999969,129,130.6000061,125.3000031,124.6999969,121.8000031,120.5999985,121,120.8000031,118.8000031,125.4000015,119.1999969,118.9000015,119.6999969,115.5999985,108.6999969
+33,9.896063487,92.47777854,0.177760008,28.57999992,96.5,129.6999969,116,106.4000015,108.9000015,108.5999985,110.4000015,114.6999969,116,121.4000015,124.1999969,126.5999985,126.4000015,129.6999969,129,131.1999969,126.4000015,117.1999969,115.9000015,113.6999969,105.8000031,96.5,94.5,85.59999847,79.40000153,77.19999695,81.30000305,78.80000305,75.19999695,74.59999847,72.59999847,73.19999695,67.59999847,69.30000305
+34,9.678585158,89.4333335,0.187830499,13.33999996,55,74.80000305,75.80000305,65.5,67.69999695,71.30000305,72.69999695,75.59999847,75.80000305,77.90000153,78,79.59999847,79.09999847,74.80000305,77.59999847,73.59999847,69,66.30000305,66.5,64.40000153,67.69999695,55,57,53.40000153,53.5,55,56.20000076,55.79999924,52,54,57,42.29999924,43.90000153,40.70000076
+35,9.821148766,88.02222273,0.177342362,27.05999985,128.6999969,161.6000061,155.5,122.5999985,124.4000015,138,146.8000031,151.8000031,155.5,171.1000061,169.3999939,162.3999939,160.8999939,161.6000061,163.8000031,162.3000031,153.8000031,144.3000031,144.5,131.1999969,128.3000031,128.6999969,120.9000015,124.3000031,120.9000015,126.5,117.1999969,120.3000031,123.1999969,102.5,97.69999695,97,94.09999847,88.90000153
+36,9.957432535,74.78888914,0.177402943,22.99999962,129.5,148.8999939,152.6999969,124.3000031,128.3999939,137,143.1000061,149.6000061,152.6999969,158.1000061,157.6999969,155.8999939,151.8000031,148.8999939,149.8999939,147.3999939,144.6999969,136.8000031,134.6000061,135.8000031,133,129.5,122.5,118.9000015,109.0999985,108.1999969,105.4000015,106.1999969,106.6999969,104.5999985,108,105.5999985,102.0999985,96.69999695
+37,9.65476354,92.58888753,0.164830512,19.80000038,109.0999985,122.3000031,123.1999969,114.5,111.5,117.5,116.5999985,119.9000015,123.1999969,129.6999969,133.8999939,131.6000061,122.0999985,122.3000031,120.5,119.8000031,115.6999969,111.9000015,109.0999985,112.0999985,107.5,109.0999985,104,104.0999985,100.0999985,97.90000153,111,104.1999969,115.1999969,112.6999969,114.5,114.5999985,112.4000015,107.9000015
+38,9.882993592,95.15555784,0.174546391,32.04000015,102.5999985,117.5999985,113.5,106.4000015,105.4000015,108.8000031,109.5,111.8000031,113.5,115.4000015,117.1999969,116.6999969,117.0999985,117.5999985,119.9000015,115.5999985,106.3000031,105.5999985,107,105.4000015,106,102.5999985,100.3000031,94,95.5,96.19999695,91.19999695,91.80000305,93.5,92.09999847,91.90000153,88.69999695,84.40000153,80.09999847
+39,9.913661109,81.00000042,0.174207053,24.97999992,114.3000031,158.1000061,160.6999969,132.1999969,131.6999969,140,141.1999969,145.8000031,160.6999969,161.5,160.3999939,160.3000031,168.6000061,158.1000061,163.1000061,157.6999969,141.1999969,128.8999939,125.6999969,124.8000031,110.4000015,114.3000031,111.4000015,96.90000153,109.0999985,110.8000031,108.4000015,111.1999969,115,110.3000031,108.8000031,102.9000015,104.8000031,90.5
+3,10.07655864,89.42222341,0.173532382,24.28000031,90.09999847,120.1999969,127.0999985,123,121,123.5,124.4000015,126.6999969,127.0999985,128,126.4000015,126.0999985,121.9000015,120.1999969,118.5999985,115.4000015,110.8000031,104.8000031,102.8000031,99.69999695,97.5,90.09999847,82.40000153,77.80000305,68.69999695,67.5,63.40000153,58.59999847,56.40000153,54.5,53.79999924,52.29999924,47.20000076,41.59999847
diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw
index 2993c05a22..c5d330ab55 100644
--- a/inst/tests/tests.Rraw
+++ b/inst/tests/tests.Rraw
@@ -22,7 +22,7 @@ if (exists("test.data.table", .GlobalEnv, inherits=FALSE)) {
   as.ITime.default = data.table:::as.ITime.default
   binary = data.table:::binary
   brackify = data.table:::brackify
-  chmatch2 = data.table:::chmatch2
+  chmatchdup = data.table:::chmatchdup
   compactprint = data.table:::compactprint
   cube.data.table = data.table:::cube.data.table
   dcast.data.table = data.table:::dcast.data.table
@@ -1017,7 +1017,7 @@ test(350.6, DT[c(0,0,0), .N], 0L)
 # Test recycling list() on RHS of :=
 DT = data.table(a=1:3,b=4:6,c=7:9,d=10:12)
 test(351, DT[,c("a","b"):=list(13:15)], data.table(a=13:15,b=13:15,c=7:9,d=10:12))
-test(352, DT[,letters[1:4]:=list(1L,NULL)], data.table(a=c(1L,1L,1L),c=c(1L,1L,1L)))
+test(352, DT[,letters[1:4]:=list(1L,NULL)], error="Supplied 4 columns to be assigned 2 items. Please see NEWS for v1.12.2")
 
 # Test assigning new levels into factor columns
 DT = data.table(f=factor(c("a","b")),x=1:4)
@@ -1313,9 +1313,9 @@ test(443, rbind(DT,data.table(a=4L,b=7L)), data.table(a=1:4,b=4:7))
 test(444, rbind(DT,list(b=7L,a=4L)), data.table(a=1:4,b=4:7)) # rbind should by default check row names. Don't warn here. Add clear documentation instead.
 test(445, rbind(DT,data.frame(b=7L,a=4L)), data.table(a=1:4,b=4:7))
 test(446, rbind(DT,data.table(b=7L,a=4L)), data.table(a=1:4,b=4:7))
-test(450, rbind(DT,list(c=4L,a=7L)), error="This could be because the items in the list may not ")
-test(451, rbind(DT,data.frame(c=4L,a=7L)), error="This could be because the items in the list may not ")
-test(452, rbind(DT,data.table(c=4L,a=7L)), error="This could be because the items in the list may not ")
+test(450, rbind(DT,list(c=4L,a=7L)), error=tt<-"Column 1 ['c'] of item 2 is missing in item 1. Use fill=TRUE to fill with NA (NULL for list columns)")
+test(451, rbind(DT,data.frame(c=4L,a=7L)), error=tt)
+test(452, rbind(DT,data.table(c=4L,a=7L)), error=tt)
 test(453, rbind(DT,list(4L,7L)), data.table(a=1:4,b=4:7))
 
 # Test new use.names argument in 1.8.0
@@ -1775,13 +1775,11 @@ t = as.ITime(strptime(c("09:10:00","09:11:00","09:11:00","09:12:00"),"%H:%M:%S")
 test(626, unique(t), t[c(1,2,4)])
 test(627, class(unique(t)), "ITime")
 
-# Test recycling list() rbind - with recent C-level changes, this seems not possible (like rbindlist)
-# old test commented.
-# test(628, rbind(data.table(a=1:3,b=5:7,c=list(1:2,1:3,1:4)), list(4L,8L,as.list(1:3))),
-#           data.table(a=c(1:3,rep(4L,3L)),b=c(5:7,rep(8L,3L)),c=list(1:2,1:3,1:4,1L,2L,3L)))
-test(628, rbind(data.table(a=1:3,b=5:7,c=list(1:2,1:3,1:4)), list(4L,8L,as.list(1:3))), error = "inconsistent with first column of that item which is length")
+# Test recycling list() rbind; #524. This was commented out until v1.12.2 when it was reinstated in PR#3455
+test(628.1, rbind(data.table(a=1:3,b=5:7,c=list(1:2,1:3,1:4)), list(4L,8L,as.list(1:3))),
+           data.table(a=c(1:3,rep(4L,3L)),b=c(5:7,rep(8L,3L)),c=list(1:2,1:3,1:4,1L,2L,3L)))
 # Test switch in .rbind.data.table for factor columns
-test(628.5, rbind(data.table(a=1:3,b=factor(letters[1:3]),c=factor("foo")), list(4L,factor("d"),factor("bar"))),
+test(628.2, rbind(data.table(a=1:3,b=factor(letters[1:3]),c=factor("foo")), list(4L,factor("d"),factor("bar"))),
           data.table(a=1:4,b=factor(letters[1:4]),c=factor(c(rep("foo",3),"bar"), levels = c("foo", "bar"))))
 
 # Test merge with common names and all.y=TRUE, #2011
@@ -1930,13 +1928,16 @@ l = list(data.table(a=1:2, b=7:8),
          data.table(b=13:14),
          list(15:16,17L),
          list(c(18,19),20:21))
-test(676, rbindlist(l[1:3]), data.table(a=1:6,b=7:12))
-test(677, rbindlist(l[c(10,1,10,2,10)]), data.table(a=1:4,b=7:10))  # NULL items ignored
+test(676.1, rbindlist(l[1:3]), ans<-data.table(a=1:6,b=7:12), warning="Column 2 [[]'V2'[]] of item 2 is missing in item 1.*Use fill=TRUE.*or use.names=FALSE")
+test(676.2, rbindlist(l[1:3], use.names=FALSE), ans)
+test(677.1, rbindlist(l[c(10,1,10,2,10)]), ans<-data.table(a=1:4,b=7:10), warning="Column 2 [[]'V2'[]] of item 4 is missing in item 2")  # NULL items ignored
+test(677.2, rbindlist(l[c(10,1,10,2,10)], use.names=FALSE), ans)
 test(678, rbindlist(l[c(1,4)]), error="Item 2 has 1 columns, inconsistent with item 1 which has 2")
-test(679, rbindlist(l[c(1:2,5)]), error="Column 2 of item 3 is length 1, inconsistent with first column of that item which is length 2.")
-test(680, rbindlist(l[c(2,6)]), data.table(a=c(3,4,18,19), V2=c(9:10,20:21)))  # coerces 18 and 19 to numeric (with eddi's changes in commit 1012 - highest type is preserved now) --- Caught and changed by Arun on 26th Jan 2014 (in commit 1099).
-### ----> Therefore this TO DO may not be necessary here anymore (added by Arun 26th Jan 2014) ---> # TO DO when options(datatable.pedantic=TRUE): test(680.5, rbindlist(l[c(2,6)]), warning="Column 1 of item 2 is type 'double', inconsistent with column 1 of item 1's type ('integer')")
-test(681, rbindlist(list(data.table(a=letters[1:2],b=c(1.2,1.3),c=1:2), list("c",1.4,3L), NULL, list(letters[4:6],c(1.5,1.6,1.7),4:6))), data.table(a=letters[1:6], b=seq(1.2,1.7,by=0.1), c=1:6))
+test(679.1, rbindlist(l[c(1:2,5)]), ans<-data.table(a=c(1:4,15:16), b=c(7:10,17L,17L)), warning="Column 2 [[]'V2'[]] of item 2 is missing in item 1")
+test(679.2, rbindlist(l[c(1:2,5)], use.names=FALSE), ans)
+test(680, rbindlist(l[c(2,6)]), data.table(a=c(3,4,18,19), V2=c(9:10,20:21)))  # coerces 18 and 19 to numeric
+test(681, rbindlist(list(data.table(a=letters[1:2],b=c(1.2,1.3),c=1:2), list("c",1.4,3L), NULL, list(letters[4:6],c(1.5,1.6,1.7),4:6))),
+          data.table(a=letters[1:6], b=seq(1.2,1.7,by=0.1), c=1:6))
 test(682, rbindlist(NULL), data.table(NULL))
 test(683, rbindlist(list()), data.table(NULL))
 test(684, rbindlist(list(NULL)), data.table(NULL))
@@ -2108,7 +2109,7 @@ test(753.1, DT[,c("x1","x2"):=4:6, verbose = TRUE], data.table(a=letters[1:3],x=
             output = "RHS for item 2 has been duplicated")
 test(753.2, DT[2,x2:=7L], data.table(a=letters[1:3],x=3:1,x1=4:6,x2=c(4L,7L,6L),key="a"))
 DT = data.table(a=letters[3:1],x=1:3,y=4:6)
-test(754, setkey(DT[,c("x1","y1","x2","y2"):=list(x,y)],a), data.table(a=letters[1:3],x=3:1,y=6:4,x1=3:1,y1=6:4,x2=3:1,y2=6:4,key="a"))
+test(754, DT[,c("x1","y1","x2"):=list(x,y)], error="Supplied 3 columns to be assigned 2 items. Please see NEWS for v1.12.2")
 # And non-recycling i.e. that a single column copy does copy the column
 DT = data.table(a=1:3)
 test(754.1, DT[,b:=a][1,a:=4L][2,b:=5L], data.table(a=INT(4,2,3),b=INT(1,5,3)))
@@ -2392,9 +2393,13 @@ test(863, after < before+0.5)
 # Even if data.table is empty, as long as there are column names, they should be considered.
 # Ex: What if all data.tables are empty? What'll be the column name then?
 # If there are no names, then the first non-empty set of names will be allocated.
-test(864.1, rbindlist(list(data.table(foo=logical(0),bar=logical(0)), DT<-data.table(baz=letters[1:3],qux=4:6))), setnames(DT, c("foo", "bar")))
+test(864.1, rbindlist(list(data.table(foo=logical(0),bar=logical(0)), DT<-data.table(baz=letters[1:3],qux=4:6))),
+            setnames(DT, c("foo", "bar")),
+            warning="Column 1 [[]'baz'[]] of item 2 is missing in item 1.*Use fill=TRUE.*or use.names=FALSE.*v1.12.2")  # test 676 tests no warning when use.names=FALSE
 test(864.2, rbindlist(list(list(logical(0),logical(0)), DT<-data.table(baz=letters[1:3],qux=4:6))), DT)
-test(864.3, rbindlist(list(data.table(logical(0),logical(0)), DT<-data.table(baz=letters[1:3],qux=4:6))), setnames(DT, c("V1", "V2")))
+test(864.3, rbindlist(list(data.table(logical(0),logical(0)), DT<-data.table(baz=letters[1:3],qux=4:6))),
+            setnames(DT, c("V1", "V2")),
+            warning="Column 1 [[]'baz'[]] of item 2 is missing in item 1.*Use fill=TRUE.*or use.names=FALSE.*v1.12.2")
 
 # Steve's find that setnames failed for numeric 'old' when pointing to duplicated names
 DT = data.table(a=1:3,b=1:3,v=1:6,w=1:6)
@@ -2924,11 +2929,12 @@ test(1034, as.data.table(x<-as.character(sample(letters, 5))), data.table(V1=x))
   # reshape2 does not need to be loaded to run these.
   # We run these routinely, in dev by cc(), on Travis (coverage) and on CRAN
   set.seed(45)
+  N=18L  # increased in v1.12.2 from 6 to 18 to get NA in f_1 for coverage
   DT <- data.table(
-        i_1 = c(1:5, NA),
-        i_2 = c(NA,6,7,8,9,10),
-        f_1 = factor(sample(c(letters[1:3], NA), 6, TRUE)),
-        c_1 = sample(c(letters[1:3], NA), 6, TRUE),
+        i_1 = c(1:(N-1L), NA),
+        i_2 = c(NA,(N:(2L*N-2L))),
+        f_1 = factor(sample(c(letters[1:3], NA), N, TRUE)),
+        c_1 = sample(c(letters[1:3], NA), N, TRUE),
         d_1 = as.Date(c(1:3,NA,4:5), origin="2013-09-01"),
         d_2 = as.Date(6:1, origin="2012-01-01"))
   DT[, l_1 := DT[, list(c=list(rep(i_1, sample(5,1)))), by = i_1]$c] # generate list cols
@@ -2941,15 +2947,15 @@ test(1034, as.data.table(x<-as.character(sample(letters, 5))), data.table(V1=x))
   test(1036, melt(DT, id=c("i_1", "i_2", "l_2"), measure=c("l_1")), ans1)
 
   # melt retains attributes if all are of same type (new)
-  ans2 = data.table(c_1=DT$c_1, variable=rep(c("d_1", "d_2"), each=6), value=as.Date(c(DT$d_1, DT$d_2)))[!is.na(value)]
+  ans2 = data.table(c_1=DT$c_1, variable=rep(c("d_1", "d_2"), each=N), value=as.Date(c(DT$d_1, DT$d_2)))[!is.na(value)]
   test(1037, melt(DT, id=4, measure=5:6, na.rm=TRUE, variable.factor=FALSE), ans2)
 
   DT2 <- data.table(x=1:5, y=1+5i) # unimplemented class
   test(1038, melt(DT2, id=1), error="Unknown column type 'complex'")
 
   # more tests
-  DT[, f_2 := factor(c("z", "a", "x", "z", "a", "a"), ordered=TRUE)]
-  DT[, id := 1:6]
+  DT[, f_2 := factor(sample(letters, N), ordered=TRUE)]
+  DT[, id := 1:N]
   ans1 = cbind(melt(DT, id="id", measure=5:6, value.name="value1"), melt(DT, id=integer(0), measure=7:8, value.name="value2")[, variable:=NULL])
   levels(ans1$variable) = as.character(1:2)
   test(1038.2, ans1, melt(DT, id="id", measure=list(5:6, 7:8)))
@@ -2966,6 +2972,10 @@ test(1034, as.data.table(x<-as.character(sample(letters, 5))), data.table(V1=x))
   levels(ans$variable) = as.character(1:2)
   test(1038.6, melt(DT, id="id", measure=list(c("c_1", "c_1"), c("f_1", "f_2"))), ans)
 
+  # non ordered factors
+  DT[, f_2 := factor(sample(letters, N), ordered=FALSE)]
+  test(1039, melt(DT, id="id", measure=c("f_1", "f_2"), value.factor=TRUE)$value, factor(c(as.character(DT$f_1), as.character(DT$f_2)), ordered=FALSE))
+
   # test to ensure attributes on non-factor id-columns are preserved after melt
   DT <- data.table(x=1:3, y=letters[1:3], z1=8:10, z2=11:13)
   setattr(DT$x, 'foo', 'bla1')
@@ -3013,6 +3023,72 @@ test(1034, as.data.table(x<-as.character(sample(letters, 5))), data.table(V1=x))
   test(1569.3, melt(dt, id=NULL, measure=-1), error="One or more values in 'measure.vars'")
   test(1569.4, melt(dt, id=5, measure=-1), error="One or more values in 'id.vars'")
   test(1569.5, melt(dt, id=1, measure=-1), error="One or more values in 'measure.vars'")
+
+  if (test_R.utils) {
+    # dup names in variable used to generate malformed factor error and/or segfault, #1754
+    R.utils::decompressFile(testDir("melt_1754.R.gz"), tt<-tempfile(), remove=FALSE, FUN=gzfile, ext=NULL)
+    source(tt, local=TRUE) # creates DT
+    test(1570.01, dim(DT), INT(1,327))
+    test(1570.02, dim(ans<-melt(DT, 1:2)), INT(325,4), warning="All measure variables not of type 'character' will be coerced")
+    test(1570.03, length(levels(ans$variable)), 317L)
+    test(1570.04, levels(ans$variable)[c(1,2,316,317)],
+      tt <- c("Geography",
+        "Estimate; SEX AND AGE - Total population",
+        "Percent; HISPANIC OR LATINO AND RACE - Total housing units",
+        "Percent Margin of Error; HISPANIC OR LATINO AND RACE - Total housing units"))
+    test(1570.05, range(as.integer(ans$variable)), INT(1,317))
+    test(1570.06, as.vector(table(table(as.integer(ans$variable)))), INT(309,8))
+    test(1570.07, sapply(ans, class), c(Id="character",Id2="integer",variable="factor",value="character"))
+    test(1570.08, dim(ans<-melt(DT, 1:2, variable.factor=FALSE)), INT(325,4), warning="All measure variables not of type 'character' will be coerced")
+    test(1570.09, sapply(ans, class), c(Id="character",Id2="integer",variable="character",value="character"))
+    test(1570.10, ans$variable[c(1,2,324,325)], tt)
+  }
+
+  # more from #1754
+  DT = fread(testDir("melt_1754_synth.csv"))
+  test(1571.1, names(DT)[duplicated(names(DT))], c("smoking75","smoking80","smoking88"))
+  test(1571.2, dim(ans<-melt(DT, id.vars=c("state","income","retailprice","percent_15_19","beercons"), measure=patterns("^smoking"))), INT(1326,7))
+  test(1571.3, print(ans[c(1,1326)]), output="state.*income.*retailprice.*percent_15_19.*beercons.*variable.*value.*1.*9.6.*89.34.*smoking88.*smoking00.*41.6")
+
+  # more from #1754
+  DT = setDT(data.frame("Time.point" = seq(0, 6), "Time.(h)" = c(0.0, 0.5, 1.0, 3.0, 5.0, 7.0, 24.0),
+                        "NEW.ME" = runif(7), "NEW.ME" = runif(7), check.names = FALSE))
+  test(1572.1, dim(melt(DT, c("Time.point", "Time.(h)"), na.rm=TRUE)), INT(14, 4))
+  DT = setDT(data.frame("Time.point" = seq(0, 6), "Time.(h)" = c(0.0, 0.5, 1.0, 3.0, 5.0, 7.0, 24.0),
+     "NEW.ME" = runif(7), "NEW.ME" = runif(7), "NEW.ME" = runif(7), "NEW.ME" = runif(7), "NEW.ME" = runif(7),
+     "NEW.ME" = runif(7), "NEW.ME" = runif(7), "NEW.ME" = runif(7), "NEW.MER" = runif(7), "F050" = runif(7),
+     "NEW.MER" = runif(7), "F16-42-123p123C" = runif(7), "F16-42-123p123C" = runif(7), "NEW.MER" = runif(7),
+     "F16-42-123p123C" = runif(7), check.names = FALSE))
+  test(1572.2, unique(names(DT)[duplicated(names(DT))]), c("NEW.ME","NEW.MER","F16-42-123p123C"))
+  test(1572.3, dim(melt(DT, c("Time.point", "Time.(h)"), na.rm = TRUE)), INT(105,4))
+
+  # more from #1754
+  DT = fread(
+"month,Record high,Average high,Daily mean,Average low,Record low,Average precipitation,Average rainfall,Average snowfall,Average precipitation,Average rainy,Average snowy,Mean monthly sunshine hours
+Jan,12.8,-5.4,-8.9,-12.4,-33.5,73.6,28.4,45.9,15.8,4.3,13.6,99.2
+Feb,15,-3.7,-7.2,-10.6,-33.3,70.9,22.7,46.6,12.8,4,11.1,119.5
+Mar,25.9,2.4,-1.2,-4.8,-28.9,80.2,42.2,36.8,13.6,7.4,8.3,158.8
+Apr,30.1,11,7,2.9,-17.8,76.9,65.2,11.8,12.5,10.9,3,181.7
+May,34.2,19,14.5,10,-5,86.5,86.5,0.4,12.9,12.8,0.14,229.8
+Jun,34.5,23.7,19.3,14.9,1.1,87.5,87.5,0,13.8,13.8,0,250.1
+Jul,36.1,26.6,22.3,17.9,7.8,106.2,106.2,0,12.3,12.3,0,271.6
+Aug,35.6,24.8,20.8,16.7,6.1,100.6,100.6,0,13.4,13.4,0,230.7
+Sep,33.5,19.4,15.7,11.9,0,100.8,100.8,0,12.7,12.7,0,174.1")
+  test(1573, print(melt(DT, id.vars="month", verbose=TRUE)), output="'measure.vars' is missing.*Assigned.*are.*Record high.*1:.*Jan.*Record high.*12.8.*108:.*Sep.*sunshine hours.*174.1")
+
+  # coverage of reworked fmelt.c:getvarcols; #1754
+  # missing id satisfies data->lvalues!=1 at C level to test those branches
+  x = data.table(x1=1:2, x2=3:4, y1=5:6, y2=7:8, z1=9:10, z2=11:12)
+  test(1574.1, dim(ans<-melt(x, measure.vars=patterns("^y", "^z"))), INT(4,5))
+  test(1574.2, ans$variable, factor(c("1","1","2","2")))
+  test(1574.3, dim(ans<-melt(x, measure.vars=patterns("^y", "^z"), variable.factor=FALSE)), INT(4,5))
+  test(1574.4, ans$variable, c("1","1","2","2"))
+  x[, c("y1","z1"):=NA]
+  test(1574.5, dim(melt(x, measure.vars=patterns("^y", "^z"))), INT(4,5))
+  test(1574.6, dim(ans<-melt(x, measure.vars=patterns("^y", "^z"), na.rm=TRUE)), INT(2,5))
+  test(1574.7, ans$variable, factor(c("1","1")))
+  test(1574.8, dim(ans<-melt(x, measure.vars=patterns("^y", "^z"), na.rm=TRUE, variable.factor=FALSE)), INT(2,5))
+  test(1574.9, ans$variable, c("1","1"))
 }
 
 # sorting and grouping of Inf, -Inf, NA and NaN,  #4684, #4815 & #4883
@@ -3485,29 +3561,28 @@ test(1118, dt[, lapply(.SD, function(y) weighted.mean(y, b2, na.rm=TRUE)), by=x]
 DT <- data.table(x=5:1, y=1:5, key="y")
 test(1119, is.null(key(DT[, list(z = y, y = 1/y)])))
 
-
 ## various ordered factor rbind tests
-DT = data.table(ordered('a', levels = c('a','b','c')))
-DT1 = data.table(factor('a', levels = c('b','a','f')))
-DT2 = data.table(ordered('b', levels = c('b','d','c')))
-DT3 = data.table(c('foo', 'bar'))
-DT4 = data.table(ordered('a', levels = c('b', 'a')))
-
-test(1120, rbind(DT, DT1, DT2, DT3), data.table(ordered(c('a','a','b', 'foo', 'bar'), levels = c('a','b','d','c','f', 'foo', 'bar'))))
-test(1121, rbindlist(list(DT, DT1, DT2, DT3)), data.table(ordered(c('a','a','b', 'foo', 'bar'), levels = c('a','b','d','c','f', 'foo', 'bar'))))
-test(1122, rbind(DT, DT4), data.table(factor(c('a','a'), levels = c('a','b','c'))), warning="ordered factor levels cannot be combined, going to convert to simple factor instead")
-test(1123, rbindlist(list(DT, DT4)), data.table(factor(c('a','a'), levels = c('a','b','c'))), warning="ordered factor levels cannot be combined, going to convert to simple factor instead")
-test(1124, rbind(DT1, DT1), data.table(factor(c('a','a'), levels = c('b','a','f'))))
-test(1125, rbindlist(list(DT1, DT1)), data.table(factor(c('a','a'), levels = c('b','a','f'))))
-
-# coverage of rbindlist.c:289, #2346.
-# The hashing there hashes pointer address (CHARSXP) so this test attempts to use a large enough
-# sample of unique strings to generate that condition reliably to test that collision branch.
+DT1 = data.table(ordered('a', levels = c('a','b','c')))
+DT2 = data.table(factor('a', levels = c('b','a','f')))
+DT3 = data.table(ordered('b', levels = c('b','d','c')))
+DT4 = data.table(c('foo', 'bar'))
+DT5 = data.table(ordered('b', levels = c('b','a')))
+test(1120.1, rbind(DT1, DT2, DT3, DT4), ans<-data.table(factor(c('a','a','b','foo','bar'), levels = c('a','b','c','f','d', 'foo', 'bar'))),
+             warning=w<-"Column 1 of item 3.*level 2 [[]'d'[]] is missing from the ordered levels from column 1 of item 1.*regular factor")
+test(1120.2, rbindlist(list(DT1, DT2, DT3, DT4)), ans, warning=w)
+test(1121.1, rbind(DT1, DT5), ans<-data.table(factor(c('a','b'), levels = c('a','b','c'))), warning=w<-"'b'<'a'.*But 'a'<'b'.*regular factor")
+test(1121.2, rbindlist(list(DT1, DT5)), ans, warning=w)
+test(1122.1, rbind(DT2, DT2), data.table(factor(c('a','a'), levels = c('b','a','f'))))
+test(1122.2, rbindlist(list(DT2, DT2)), data.table(factor(c('a','a'), levels = c('b','a','f'))))
+test(1123.1, rbind(DT2,DT5), data.table(ordered(c('a','b'), levels=c('b','a','f'))))
+test(1123.2, rbind(DT5,DT2), data.table(ordered(c('b','a'), levels=c('b','a','f'))))
+
+# Old test to cover pre-PR#3455 rbindlist.c:289, #2346 (hashing CHARSXP no longer done)
 set.seed(1)
 manyChars = paste0("id",sample(99999,10000))
 DT1 = data.table(ordered(sample(manyChars, 1000), levels=sample(manyChars)))
 DT2 = data.table(factor(sample(manyChars, 1000)))
-test(1125.1, rbindlist(list(DT1,DT2))[c(1,2,.N-1,.N),as.character(V1)], c("id85645","id80957","id73436","id33445"))
+test(1125, rbindlist(list(DT1,DT2))[c(1,2,.N-1,.N),as.character(V1)], c("id85645","id80957","id73436","id33445"))
 
 ## test rbind(..., fill = TRUE)
 DT = data.table(a = 1:2, b = 1:2)
@@ -3908,22 +3983,20 @@ A <- data.table(x=factor(1), key='x')
 B <- data.table(x=factor(), key='x')
 test(1168.1, rbindlist(list(B,A)), data.table(x=factor(1)))
 
-# fix for bug #5120, it's related to rbind and factors as well - more or less similar to 1168.1 (#5355). Seems to have been fixed with that commit. Just adding test here.
+# fix for bug #5120, it's related to rbind and factors as well - more or less similar to 1168.1 (#5355).
 tmp1 <- as.data.table(structure(list(Year = 2013L, Maturity = structure(1L, .Label = c("<1",
-"1.0 - 1.5", "1.5 - 2.0", "2.0 - 2.5", "2.5 - 3.0", "3.0 - 4.0",
-"4.0 - 5.0", ">5.0"), class = "factor"), Quality = structure(2L, .Label = c(">BBB",
-"BBB", "BB", "B", "CCC", "<CCC", "NR", "CASH"), class = c("ordered",
-"factor")), Ct = 2L, Wt = 1.56, CtTotRet = 1.08, TotRet = 69.2307692307692), .Names = c("Year",
-"Maturity", "Quality", "Ct", "Wt", "CtTotRet", "TotRet"), class = c("data.table",
-"data.frame"), row.names = c(NA, -1L)))
-
+  "1.0 - 1.5", "1.5 - 2.0", "2.0 - 2.5", "2.5 - 3.0", "3.0 - 4.0",
+  "4.0 - 5.0", ">5.0"), class = "factor"), Quality = structure(2L, .Label = c(">BBB",
+  "BBB", "BB", "B", "CCC", "<CCC", "NR", "CASH"), class = c("ordered",
+  "factor")), Ct = 2L, Wt = 1.56, CtTotRet = 1.08, TotRet = 69.2307692307692), .Names = c("Year",
+  "Maturity", "Quality", "Ct", "Wt", "CtTotRet", "TotRet"), class = c("data.table",
+  "data.frame"), row.names = c(NA, -1L)))
 tmp2 <- as.data.table(structure(list(Year = 2013L, Maturity = "TOTAL", Quality = "TOTAL",
-Ct = 214L, Wt = 100.001, CtTotRet = 406.26, TotRet = 406.255937440626), .Names = c("Year",
-"Maturity", "Quality", "Ct", "Wt", "CtTotRet", "TotRet"), class = c("data.table",
-"data.frame"), row.names = c(NA, -1L)))
-
-ans <- rbind(tmp1, tmp2)
-test(1168.2, as.data.frame(ans), rbind(as.data.frame(tmp1), as.data.frame(tmp2)))
+  Ct = 214L, Wt = 100.001, CtTotRet = 406.26, TotRet = 406.255937440626), .Names = c("Year",
+  "Maturity", "Quality", "Ct", "Wt", "CtTotRet", "TotRet"), class = c("data.table",
+  "data.frame"), row.names = c(NA, -1L)))
+# "TOTAL" is added to the end of the ordered levels and ordered factor retained
+test(1168.2, as.data.frame(rbind(tmp1,tmp2)), rbind(as.data.frame(tmp1), as.data.frame(tmp2)))
 
 # checks of "" and NA_character_ ordering.
 test(1169, forderv(c(NA,"","a","NA")), INT(1,2,4,3))  # data.table does ascii ordering currently, so N comes before a
@@ -4605,17 +4678,17 @@ test(1287, ans, data.table(Time=a$Time, demand.x=NA_real_, demand.y=a$demand, ke
 ll <- list(data.table(x=1, y=-1, x=-2), data.table(y=10, y=20, y=30, x=-10, a="a", b=Inf, c=factor(1)))
 test(1288.01, rbindlist(ll, use.names=TRUE, fill=FALSE), error = "Item 2 has 7 columns, inconsistent with item 1 which has 3 columns")
 # modified after fixing #725
-test(1288.02, rbindlist(ll, use.names=TRUE, fill=TRUE),
-    data.table(x=c(1,-10), x=c(-2, NA), y=c(-1,10), y=c(NA,20), y=c(NA,30), a=c(NA, "a"), b=c(NA, Inf), c=factor(c(NA, 1))))
+test(1288.02, rbindlist(ll, use.names=TRUE, fill=TRUE),   # dups were grouped before 1.12.2; now order of dups is retained; #3455
+    data.table(x=c(1,-10), y=c(-1,10), x=c(-2, NA), y=c(NA,20), y=c(NA,30), a=c(NA, "a"), b=c(NA, Inf), c=factor(c(NA, 1))))
 
 # check the name of output are consistent when binding two empty dts with one empy and other non-empty dt
 dt1 <- data.table(x=1:5, y=6:10)
 dt2 <- dt1[x > 5]
 setnames(dt3 <- copy(dt2), c("A", "B"))
-test(1288.03, names(rbindlist(list(dt2,dt3))), c("x", "y"))
-test(1288.04, names(rbindlist(list(dt3,dt2))), c("A", "B"))
-test(1288.05, names(rbindlist(list(dt1,dt3))), c("x", "y"))
-test(1288.06, names(rbindlist(list(dt3,dt1))), c("A", "B"))
+test(1288.03, names(rbindlist(list(dt2,dt3), use.names=FALSE)), c("x", "y"))  # use.names=FALSE to avoid new warning in v1.12.2; PR#3455
+test(1288.04, names(rbindlist(list(dt3,dt2), use.names=FALSE)), c("A", "B"))
+test(1288.05, names(rbindlist(list(dt1,dt3), use.names=FALSE)), c("x", "y"))
+test(1288.06, names(rbindlist(list(dt3,dt1), use.names=FALSE)), c("A", "B"))
 
 # check fix for bug #5612
 DT <- data.table(x=c(1,2,3))
@@ -4638,13 +4711,14 @@ test(1288.11, rbindlist(ll, use.names=TRUE), data.table(a=c(1:3, 5:7), b=c(4:6,
 ll <- list(list(1:3, 4:6), list(a=5:7, b=8:10))
 test(1288.12, rbindlist(ll, use.names=TRUE), data.table(a=c(1:3, 5:7), b=c(4:6, 8:10)))
 ll <- list(list(a=1:3, 4:6), list(5:7, b=8:10))
-test(1288.13, rbindlist(ll, use.names=TRUE), error="Answer requires 3 columns whereas one or more item(s) in the input list has only 2 columns. This could be because the items in the list may not")
+test(1288.13, rbindlist(ll, use.names=TRUE), error="Column 2 ['b'] of item 2 is missing in item 1. Use fill=TRUE to fill with NA")
 ll <- list(list(a=1:3, 4:6), list(5:7, b=8:10))
 test(1288.14, rbindlist(ll, fill=TRUE), data.table(a=c(1:3, rep(NA_integer_,3L)), V1=c(4:6,5:7), b=c(rep(NA_integer_, 3L), 8:10)))
 ll <- list(list(1:3, 4:6), list(5:7, 8:10))
-test(1288.15, rbindlist(ll, fill=TRUE), error="fill=TRUE, but names of input list at position 1")
-ll <- list(list(1:3, 4:6), list(a=5:7, b=8:10))
-test(1288.16, rbindlist(ll, fill=TRUE), error="fill=TRUE, but names of input list at position 1")
+test(1288.15, rbindlist(ll, fill=TRUE), error="use.names=TRUE but no item of input list has any names")
+ll <- list(list(1:3, 6:8), list(a=4:5, b=9:10))
+test(1288.16, rbindlist(ll), data.table(a=1:5, b=6:10))
+test(1288.17, rbindlist(ll, fill=TRUE), data.table(a=1:5, b=6:10))
 
 # fix for #5647
 dt = data.table(x=1L, y=1:10)
@@ -6205,14 +6279,16 @@ test(1454.2, fread('"Foo"\n5\n',sep="`"), data.table(Foo=5L))
 DT <- data.table(a=c(1, 1, 1, 0, 0), b=c("A", "B", "A1", "A", "B"))
 test(1455, DT[, nrow(.SD[b == 'B']), by=.(a)], data.table(a=c(1,0), V1=1L))
 
-# Test for chmatch2 bug fix
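+# chmatchdup (renamed from chmatch2); each duplicate in x matches the next unused duplicate in table, like base::pmatch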
 x1 = c("b", "a", "d", "a", "c", "a")
 x2 = c("a", "a", "a")
 x3 = c("d", "a", "a", "d", "a")
 table = rep(letters[1:3], each=2)
-test(1456.1, chmatch2(x1, table), as.integer(c(3,1,NA,2,5,NA)))
-test(1456.2, chmatch2(x2, table), as.integer(c(1,2,NA)))
-test(1456.3, chmatch2(x3, table), as.integer(c(NA,1,2,NA,NA)))
+test(1456.1, chmatchdup(x1, table), as.integer(c(3,1,NA,2,5,NA)))
+test(1456.2, chmatchdup(x2, table), as.integer(c(1,2,NA)))
+test(1456.3, chmatchdup(x3, table), as.integer(c(NA,1,2,NA,NA)))
+test(1457.1,   chmatchdup(c("x","x","x","x"), c("x","y","x","x","y","z")), INT(1,3,4,NA))
+test(1457.2, base::pmatch(c("x","x","x","x"), c("x","y","x","x","y","z")), INT(1,3,4,NA))
 
 # Add tests for which_
 x = sample(c(-5:5, NA), 25, TRUE)
@@ -6739,8 +6815,8 @@ test(1493, dt[, .(x=sum(x)),by= x %% 2, verbose=TRUE], data.table(`x%%2`=c(1,0),
 # Fix for #705
 DT1 = data.table(date=as.POSIXct("2014-06-22", format="%Y-%m-%d", tz="GMT"))
 DT2 = data.table(date=as.Date("2014-06-23"))
-test(1494.1, rbind(DT1, DT2), error="Class attributes at column")
-test(1494.2, rbind(DT2, DT1), error="Class attributes at column")
+test(1494.1, rbind(DT1, DT2), error="Class attribute on column")
+test(1494.2, rbind(DT2, DT1), error="Class attribute on column")
 
 # test 1495 has been added to melt's test section (fix for #1055)
 
@@ -13626,6 +13702,135 @@ dx = data.table(id = 1L, key = "id")
 di = list(z=c(2L, 1L))
 test(1999.2, key(dx[di]), NULL)
 
+# chmatchdup test from benchmark at the bottom of chmatch.c
+set.seed(45L)
+x = sample(letters, 1e5, TRUE)
+y = sample(letters, 1e6, TRUE)
+test(2000, c(head(ans<-chmatchdup(x,y,0L)),tail(ans)), INT(7,49,11,20,69,25,99365,100750,97596,99671,103320,99406))
+rm(list=c("x","y"))
+
+# rbindlist use.names=TRUE returned random column order when ncol>255; #3373
+DT = setDT(replicate(300, rnorm(3L), simplify = FALSE))
+test(2001.1, colnames(rbind(DT[1], DT[3])), colnames(DT))
+# and use.names=TRUE keeps dups in original location; mentioned in #3373
+DT1 = data.table(a=1L, b=3L, c=5L, b=7L)
+DT2 = data.table(a=2L, b=4L, c=6L, b=8L)
+test(2001.2, rbind(DT1, DT2, use.names = TRUE), data.table(a=1:2, b=3:4, c=5:6, b=7:8))  # dup of b at the end; was a,b,b,c
+
+# rbindlist now fills NULL and empty columns with NA with warning, #1871
+test(2002.1, rbindlist( list(list(a=1L, b=2L, x=NULL), list(a=2L, b=3L, x=10L)) ),
+             data.table(a=1:2, b=2:3, x=INT(NA,10)),
+             warning="Column 3 ['x'] of item 1 is length 0. This (and 0 others like it) has been filled with NA (NULL for list columns) to make each item uniform.")
+test(2002.2, rbindlist( list(list(a=1L, b=2L, x=NULL), list(a=2L, b=NULL, x=10L)) ),
+             data.table(a=1:2, b=INT(2,NA), x=INT(NA,10)),
+             warning="Column 3 ['x'] of item 1 is length 0. This (and 1 other like it) has been filled with NA (NULL for list columns) to make each item uniform.")
+test(2002.3, rbindlist( list(list(a=1L, b=2L, x=NULL), list(a=2L, b=NULL, x=NULL)) ),
+             data.table(a=1:2, b=INT(2,NA), x=c(NA,NA)),
+             warning="Column 3 ['x'] of item 1 is length 0. This (and 2 others like it) has been filled with NA (NULL for list columns) to make each item uniform.")
+# tests from #1302
+test(2002.4, rbindlist( list(list(a=1L,z=list()), list(a=2L, z=list("m"))) ),
+             data.table(a=1:2, z=list(NULL, "m")),
+             warning="Column 2 ['z'] of item 1 is length 0. This (and 0 others like it) has been filled with NA")
+test(2002.5, rbindlist( list( list(a=1L, z=list("z")), list(a=2L, z=list(c("a","b"))) )),
+             data.table(a=1:2, z=list("z", c("a","b"))))
+test(2002.6, rbindlist( list( list(a=1:2, z=list("z",1,"k")), list(a=2, z=list(c("a","b"))) )),
+             error="Column 1 of item 1 is length 2 inconsistent with column 2 which is length 3. Only length-1 columns are recycled.")
+test(2002.7, rbindlist( list(list(a=1L, z=list(list())), list(a=2L, z=list(list("m")))) ),
+             data.table(a=1:2, z=list(list(),list("m"))))
+test(2002.8, rbindlist( list(list(a=1L, z=list(list("z"))), list(a=2L, z=list(list(c("a","b"))))) ),
+             data.table(a=1:2, z=list(list("z"), list(c("a","b")))))
+test(2002.9, rbindlist( list(list(a=1L, z=list(list("z",1))), list(a=2L, z=list(list(c("a","b"))))) ),
+             data.table(a=1:2, z=list(list("z",1), list(c("a","b")))))
+# tests from #3343
+DT1=list(a=NULL); setDT(DT1)
+DT2=list(a=NULL); setDT(DT2)
+test(2002.10, rbind(DT1, DT2),                 data.table(a=logical()))
+test(2002.11, rbind(A=DT1, B=DT2, idcol='id'), data.table(id=character(), a=logical()))
+test(2002.12, rbind(DT1, DT2, idcol='id'),     data.table(id=integer(), a=logical()))
+
+# rbindlist coverage
+test(2003.1, rbindlist(list(), use.names=1), error="use.names= should be TRUE, FALSE, or not used [(]\"check\" by default[)]")
+test(2003.2, rbindlist(list(), fill=1), error="fill= should be TRUE or FALSE")
+test(2003.3, rbindlist(list(data.table(a=1:2), data.table(b=3:4)), fill=TRUE, use.names=FALSE),
+             data.table(a=c(1:2,NA,NA), b=c(NA,NA,3:4)),
+             warning="use.names= cannot be FALSE when fill is TRUE. Setting use.names=TRUE")
+
+# chmatch coverage for two different non-ascii encodings matching; issues mentioned in comments in chmatch.c #5159 #2538 #4818
+x1 = "fa\xE7ile"
+Encoding(x1) = "latin1"
+x2 = iconv(x1, "latin1", "UTF-8")
+test(2004.1, identical(x1,x2))
+test(2004.2, Encoding(x1)!=Encoding(x2))
+test(2004.3, chmatch(c("a",x1,"b"), x2), c(NA,1L,NA))       # x contains mixed; covers first fallback in chmatchMain
+test(2004.4, c("a",x1,"b") %chin% x2, c(FALSE,TRUE,FALSE))  # and the chin switch in the same fallback
+test(2004.5, chmatch(c("a","b"), c("b",x1)), c(NA,1L))      # x doesn't contain encodings so covers the second fallback in chmatchMain
+test(2004.6, chmatch(c("a","b"), c("b",x2)), c(NA,1L))      #   the second fallback might be redundant though; see comments in chmatch.c
+test(2004.7, c("a","b") %in% c("b",x1,x2), c(FALSE, TRUE))  #   the second fallback might be redundant though; see comments in chmatch.c
+
+# more coverage ...
+test(2005.1, truelength(NULL), 0L)
+DT = data.table(a=1:3, b=4:6)
+test(2005.2, set(DT, 4L, "b", NA), error="i[1] is 4 which is out of range [1,nrow=3]")
+test(2005.3, set(DT, 3L, 8i, NA), error="j is type 'complex'. Must be integer, character, or numeric is coerced with warning.")
+test(2005.4, set(DT, 1L, 2L, expression(x+2)), error="RHS of assignment is not NULL, not an an atomic vector (see ?is.atomic) and not a list column.")
+DT[,foo:=factor(c("a","b","c"))]
+test(2005.5, DT[2, foo:=8i], error="Can't assign to column 'foo' (type 'factor') a value of type 'complex' (not character, factor, integer or numeric)")
+test(2005.6, DT[2, a:=9, verbose=TRUE], output="Coerced length-1 RHS from double to integer to match column's type. No precision was lost. If this")
+test(2005.7, DT[2, a:=NA, verbose=TRUE], output="Coerced length-1 RHS from logical to integer to match column's type. If this")
+test(2005.8, DT[2, a:=9.9]$a, INT(1,9,3), warning="Coerced double RHS to integer.*One or more RHS values contain fractions which have been lost.*9.9.*has been truncated to 9")
+
+# rbindlist raw type, #2819
+test(2006.1, rbindlist(list(data.table(x = as.raw(1), y=as.raw(3)), data.table(x = as.raw(2))), fill=TRUE), data.table(x=as.raw(1:2), y=as.raw(c(3,0))))
+test(2006.2, rbindlist(list(data.table(x = as.raw(1:2), y=as.raw(5:6)), data.table(x = as.raw(3:5))), fill=TRUE), data.table(x=as.raw(1:5), y=as.raw(c(5:6,0,0,0))))
+
+# rbindlist integer64, #1349
+if (test_bit64) {
+  test(2007.1, rbindlist(list( list(a=as.integer64(1), b=3L),  list(a=2L, b=4L) )), data.table(a=as.integer64(1:2), b=3:4))
+  test(2007.2, rbindlist(list( list(a=3.4, b=5L),  list(a=as.integer64(4), b=6L) )), data.table(a=as.integer64(3:4), b=5:6),
+               warning="Column 1 of item 1: coerced to integer64 but contains a non-integer value [(]3.40.* at position 1[)]; precision lost")
+  test(2007.3, rbindlist(list( list(a=3.0, b=5L),  list(a=as.integer64(4), b=6L) )), data.table(a=as.integer64(3:4), b=5:6))
+  test(2007.4, rbindlist(list( list(b=5:6),  list(a=as.integer64(4), b=7L)), fill=TRUE), data.table(b=5:7, a=as.integer64(c(NA,NA,4))))  # tests writeNA of integer64
+  test(2007.5, rbindlist(list( list(a=INT(1,NA,-2)),  list(a=as.integer64(c(3,NA))) )), data.table(a=as.integer64(c(1,NA,-2,3,NA))))   # int NAs combined with int64 NA
+  test(2007.6, rbind(data.table(a=as.raw(10), b=5L),  data.table(a=as.integer64(11), b=6L)), data.table(a=as.integer64(10:11), b=5:6))
+}
+
+# reworked ordered-factor handling in PR#3455, expanded from test for #3032
+DT1 = data.table(x = ordered(vals<-c("b","b","e","f","c","c"), levels=c("f","b","e","c")))
+DT2 = data.table(x = ordered(vals,                             levels=c("f","e","b","c")))
+DT3 = data.table(x = ordered(vals,                             levels=c("f","b","e","c","d")))
+DT4 = data.table(x = ordered(vals,                             levels=c("f","b","e","c","a","p")))
+test(2008.1, DT1$x[3] < DT1$x[5])  # e<c;  just to remind what ordered factors are
+test(2008.2, factor(DT1$x, ordered=FALSE)[3] < factor(DT1$x, ordered=FALSE)[5], NA, warning="<.*not meaningful for factors")  # base R's nice warning
+test(2008.3, rbind(DT1, DT4), data.table(x=ordered(c(vals,vals), levels=c("f","b","e","c","a","p"))))
+test(2008.4, rbind(DT1, DT2), data.table(x=factor(c(vals,vals), levels=c("f","b","e","c"))),
+             warning="Column 1 of item 2 is an ordered factor with 'e'<'b' in its levels. But 'b'<'e' in the ordered levels from column 1 of item 1.*regular factor")
+test(2008.5, rbind(DT3, DT4), data.table(x=factor(c(vals,vals), levels=c("f","b","e","c","a","p","d"))),
+             warning="Column 1 of item 1.*level 5 [[]'d'[]] is missing.*column 1 of item 2. Each set.*should be an ordered subset of the first longest.*regular factor")
+test(2008.6, rbindlist(list(DT1, DT2, DT3, DT4)), data.table(x=factor(rep(vals,4), levels=c("f","b","e","c","a","p","d"))),
+             warning="'e'<'b'.*But 'b'<'e'")
+test(2008.7, rbindlist(list(DT1, list(c("e","b")), DT1)), data.table(x=ordered(c(vals,"e","b",vals), levels=c("f","b","e","c"))))
+test(2008.8, rbindlist(list(DT1, list(c("e","foo")), DT1)), data.table(x=ordered(c(vals,"e","foo",vals), levels=c("f","b","e","c","foo"))))
+
+# segfault comparing NULL column, #2303 #2305
+DT = structure(list(NULL), names="a", class=c("data.table","data.frame"))
+test(2009.1, DT[a>1], error="Internal error: column 1 of data.table is NULL; malformed")
+DT = null.data.table()
+x = NULL
+test(2009.2, DT[, .(x)], error="Column 1 of j evaluates to NULL. A NULL column is invalid.")
+test(2009.3, data.table(character(0), NULL), error="column or argument 2 is NULL")
+test(2009.4, as.data.table(list(y = character(0), x = NULL)), data.table(y=character()))
+
+# use.names=NA warning for out-of-order; https://github.com/Rdatatable/data.table/pull/3455#issuecomment-472744347
+DT1 = data.table(a=1:2, b=5:6)
+DT2 = data.table(b=7:8, a=3:4)
+test(2010.1, rbindlist(list(DT1,DT2)), ans<-data.table(a=c(1:2,7:8), b=c(5:6,3:4)),
+             warning="Column 2 [[]'a'[]] of item 2 appears in position 1 in item 1.*use.names=TRUE.*or use.names=FALSE.*v1.12.2")
+test(2010.2, rbindlist(list(DT1,DT2), use.names=FALSE), ans)
+test(2010.3, rbindlist(list(DT1,DT2), use.names=TRUE), data.table(a=1:4, b=5:8))
+test(2010.4, rbindlist(list(DT1,DT2), use.names=NA), error="use.names=NA invalid")
+test(2010.5, rbindlist(list(DT1,DT2), use.names='check'),
+             error="use.names='check' cannot be used explicitly because the value 'check' is new in v1.12.2 and subject to change. It is just meant to convey default behavior.")
+
 
 ###################################
 #  Add new tests above this line  #
diff --git a/man/rbindlist.Rd b/man/rbindlist.Rd
index 1cbe4f5608..9b11c0b0cf 100644
--- a/man/rbindlist.Rd
+++ b/man/rbindlist.Rd
@@ -4,32 +4,26 @@
 \alias{rbind}
 \title{ Makes one data.table from a list of many }
 \description{
-  Same as \code{do.call("rbind", l)} on \code{data.frame}s, but much faster. See \code{DETAILS} for more.
+  Same as \code{do.call("rbind", l)} on \code{data.frame}s, but much faster.
 }
 \usage{
-rbindlist(l, use.names=fill, fill=FALSE, idcol=NULL)
+rbindlist(l, use.names="check", fill=FALSE, idcol=NULL)
 # rbind(\dots, use.names=TRUE, fill=FALSE, idcol=NULL)
 }
 \arguments{
-  \item{l}{ A list containing \code{data.table}, \code{data.frame} or \code{list} objects. At least one of the inputs should have column names set. \code{\dots} is the same but you pass the objects by name separately. }
-  \item{use.names}{If \code{TRUE} items will be bound by matching column names. By default \code{FALSE} for \code{rbindlist} (for backwards compatibility) and \code{TRUE} for \code{rbind} (consistency with base). Columns with duplicate names are bound in the order of occurrence, similar to base. When TRUE, at least one item of the input list has to have non-null column names.}
-  \item{fill}{If \code{TRUE} fills missing columns with NAs. By default \code{FALSE}. When \code{TRUE}, \code{use.names} has to be \code{TRUE}, and all items of the input list has to have non-null column names. }
-  \item{idcol}{Generates an index column. Default (\code{NULL}) is not to. If \code{idcol=TRUE} then the column is auto named \code{.id}. Alternatively the column name can be directly provided, e.g., \code{idcol = "id"}.
-
-  If input is a named list, ids are generated using them, else using integer vector from \code{1} to length of input list. See \code{examples}.}
+  \item{l}{ A list containing \code{data.table}, \code{data.frame} or \code{list} objects. \code{\dots} is the same but you pass the objects by name separately. }
+  \item{use.names}{\code{TRUE} binds by matching column name, \code{FALSE} by position. \code{"check"} (default) warns if the items do not all have the same names in the same order and then currently proceeds as if \code{use.names=FALSE} for backwards compatibility (\code{TRUE} in future); see news for v1.12.2.}
+  \item{fill}{\code{TRUE} fills missing columns with NAs. By default \code{FALSE}. When \code{TRUE}, \code{use.names} is set to \code{TRUE}.}
+  \item{idcol}{Creates a column in the result showing which list item those rows came from. \code{TRUE} names this column \code{".id"}. \code{idcol="file"} names this column \code{"file"}. If the input list has names, those names are the values placed in this id column, otherwise the values are an integer vector \code{1:length(l)}. See \code{examples}.}
 }
 \details{
-Each item of \code{l} can be a \code{data.table}, \code{data.frame} or \code{list}, including \code{NULL} (skipped) or an empty object (0 rows). \code{rbindlist} is most useful when there are a variable number of (potentially many) objects to stack, such as returned by \code{lapply(fileNames, fread)}. \code{rbind} however is most useful to stack two or three objects which you know in advance. \code{\dots} should contain at least one \code{data.table} for \code{rbind(\dots)} to call the fast method and return a \code{data.table}, whereas \code{rbindlist(l)} always returns a \code{data.table} even when stacking a plain \code{list} with a \code{data.frame}, for example.
-
-In versions \code{<= v1.9.2}, each item for \code{rbindlist} should have the same number of columns as the first non empty item. \code{rbind.data.table} gained a \code{fill} argument to fill missing columns with \code{NA} in \code{v1.9.2}, which allowed for \code{rbind(\dots)} binding unequal number of columns.
-
-In version \code{> v1.9.2}, these functionalities were extended to \code{rbindlist} (and written entirely in C for speed). \code{rbindlist} has \code{use.names} argument, which is set to \code{FALSE} by default for backwards compatibility. It also contains \code{fill} argument as well and can bind unequal columns when set to \code{TRUE}.
+Each item of \code{l} can be a \code{data.table}, \code{data.frame} or \code{list}, including \code{NULL} (skipped) or an empty object (0 rows). \code{rbindlist} is most useful when there are an unknown number of (potentially many) objects to stack, such as returned by \code{lapply(fileNames, fread)}. \code{rbind} is most useful to stack two or three objects which you know in advance. \code{\dots} should contain at least one \code{data.table} for \code{rbind(\dots)} to call the fast method and return a \code{data.table}, whereas \code{rbindlist(l)} always returns a \code{data.table} even when stacking a plain \code{list} with a \code{data.frame}, for example.
 
-With these changes, the only difference between \code{rbind(\dots)} and \code{rbindlist(l)} is their \emph{default argument} \code{use.names}.
+Columns with duplicate names are bound in the order of occurrence, similar to base. The position (column number) at which each duplicate name occurs is also retained.
 
-If column \code{i} of input items do not all have the same type; e.g, a \code{data.table} may be bound with a \code{list} or a column is \code{factor} while others are \code{character} types, they are coerced to the highest type (SEXPTYPE).
+If column \code{i} does not have the same type in each of the list items (e.g., the column is \code{integer} in item 1 while others are \code{numeric}), it is coerced to the highest type.
 
-Note that any additional attributes that might exist on individual items of the input list would not be preserved in the result.
+If a column contains factors then a factor is created. If any of the factors are also ordered factors then the longest set of ordered levels is found (the first if this is tied). Then the ordered levels from each list item are checked to be an ordered subset of these longest levels. If any ambiguities are found (e.g. \code{blue<green} vs \code{green<blue}), or any ordered levels are missing from the longest, then a regular factor is created with a warning. Any strings in regular factor and character columns which are missing from the longest ordered levels are added at the end.
 }
 \value{
     An unkeyed \code{data.table} containing a concatenation of all the items passed in.
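
The revised argument descriptions above can be illustrated with a short R sketch. It assumes data.table v1.12.2 or later is installed; the object names (`by_name`, `by_pos`, `with_id`, `filled`) and the list names `first`/`second` are made up for illustration, and `use.names=` is always passed explicitly so that the `"check"` default and its warning are not involved.

```r
library(data.table)

DT1 = data.table(a=1:2, b=5:6)
DT2 = data.table(b=7:8, a=3:4)   # same column names as DT1, different order

# use.names=TRUE binds by matching column name
by_name = rbindlist(list(DT1, DT2), use.names=TRUE)

# use.names=FALSE binds by position, so DT2's 'b' values land under 'a'
by_pos = rbindlist(list(DT1, DT2), use.names=FALSE)

# idcol adds a column saying which list item each row came from;
# because the input list is named, those names are used as the values
with_id = rbindlist(list(first=DT1, second=DT2), use.names=TRUE, idcol="file")

# fill=TRUE pads a column that is missing from an item with NA
filled = rbindlist(list(DT1, data.table(a=9L)), fill=TRUE)
```

Writing `use.names=FALSE` explicitly when binding by position is the form the news item recommends, since it makes the intent clear to readers of the code.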
diff --git a/src/assign.c b/src/assign.c
index 0e516154ef..77f67d34cf 100644
--- a/src/assign.c
+++ b/src/assign.c
@@ -2,9 +2,6 @@
 #include <Rdefines.h>
 #include <Rmath.h>
 
-static SEXP *saveds=NULL;
-static R_len_t *savedtl=NULL, nalloc=0, nsaved=0;
-
 static void finalizer(SEXP p)
 {
   SEXP x;
@@ -136,7 +133,7 @@ static int _selfrefok(SEXP x, Rboolean checkNames, Rboolean verbose) {
     // because R copies the original vector's tl over despite allocating length.
   prot = R_ExternalPtrProtected(v);
   if (TYPEOF(prot) != EXTPTRSXP)   // Very rare. Was error(".internal.selfref prot is not itself an extptr").
-    return 0;                    // See http://stackoverflow.com/questions/15342227/getting-a-random-internal-selfref-error-in-data-table-for-r
+    return 0;                      // # nocov ; see http://stackoverflow.com/questions/15342227/getting-a-random-internal-selfref-error-in-data-table-for-r
   if (x != R_ExternalPtrAddr(prot))
     SET_TRUELENGTH(x, LENGTH(x));  // R copied this vector not data.table, it's not actually over-allocated
   return checkNames ? names==tag : x==R_ExternalPtrAddr(prot);
@@ -266,23 +263,13 @@ SEXP shallowwrapper(SEXP dt, SEXP cols) {
 }
 
 SEXP truelength(SEXP x) {
-  SEXP ans;
-  PROTECT(ans = allocVector(INTSXP, 1));
-  if (!isNull(x)) {
-     INTEGER(ans)[0] = TRUELENGTH(x);
-  } else {
-     INTEGER(ans)[0] = 0;
-  }
-  UNPROTECT(1);
-  return(ans);
+  return ScalarInteger(isNull(x) ? 0 : TRUELENGTH(x));
 }
 
 SEXP selfrefokwrapper(SEXP x, SEXP verbose) {
   return ScalarInteger(_selfrefok(x,FALSE,LOGICAL(verbose)[0]));
 }
 
-void memrecycle(SEXP target, SEXP where, int r, int len, SEXP source);
-
 SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP verb)
 {
   // For internal use only by := in [.data.table, and set()
@@ -341,10 +328,11 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
       error("i is type '%s'. Must be integer, or numeric is coerced with warning. If i is a logical subset, simply wrap with which(), and take the which() outside the loop if possible for efficiency.", type2char(TYPEOF(rows)));
     targetlen = length(rows);
     numToDo = 0;
+    const int *rowsd = INTEGER(rows);
     for (i=0; i<targetlen; i++) {
-      if ((INTEGER(rows)[i]<0 && INTEGER(rows)[i]!=NA_INTEGER) || INTEGER(rows)[i]>nrow)
-        error("i[%d] is %d which is out of range [1,nrow=%d].",i+1,INTEGER(rows)[i],nrow);
-      if (INTEGER(rows)[i]>=1) numToDo++;
+      if ((rowsd[i]<0 && rowsd[i]!=NA_INTEGER) || rowsd[i]>nrow)
+        error("i[%d] is %d which is out of range [1,nrow=%d].",i+1,rowsd[i],nrow);  // set() reaches here (test 2005.2); := reaches the same error in subset.c first
+      if (rowsd[i]>=1) numToDo++;
     }
     if (verbose) Rprintf("Assigning to %d row subset of %d rows\n", numToDo, nrow);
     // TODO: include in message if any rows are assigned several times (e.g. by=.EACHI with dups in i)
@@ -355,12 +343,12 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
     }
   }
   if (!length(cols)) {
-    warning("length(LHS)==0; no columns to delete or assign RHS to.");
+    warning("length(LHS)==0; no columns to delete or assign RHS to.");   // test 1295 covers
     return(dt);
   }
   // FR #2077 - set able to add new cols by reference
   if (isString(cols)) {
-    PROTECT(tmp = chmatch(cols, names, 0, FALSE));
+    PROTECT(tmp = chmatch(cols, names, 0));
     protecti++;
     buf = (int *) R_alloc(length(cols), sizeof(int));
     for (i=0; i<length(cols); i++) {
@@ -378,7 +366,7 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
     cols = tmp;
   } else {
     if (isReal(cols)) {
-      cols = PROTECT(cols = coerceVector(cols, INTSXP));
+      cols = PROTECT(coerceVector(cols, INTSXP));
       protecti++;
       warning("Coerced j from numeric to integer. Please pass integer for efficiency; e.g., 2L rather than 2");
     }
@@ -387,21 +375,12 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
   }
   if (any_duplicated(cols,FALSE)) error("Can't assign to the same column twice in the same query (duplicates detected).");
   if (!isNull(newcolnames) && !isString(newcolnames)) error("newcolnames is supplied but isn't a character vector");
-  if (isNull(values)) {
-    if (!length(cols)) {
-      warning("RHS is NULL, meaning delete columns(s). But, no columns in LHS to delete.");
-      return(dt);
-    }
-  } else {
-    if (TYPEOF(values)==VECSXP) {
-      if (length(cols)>1) {
-        if (length(values)==0) error("Supplied %d columns to be assigned an empty list (which may be an empty data.table or data.frame since they are lists too). To delete multiple columns use NULL instead. To add multiple empty list columns, use list(list()).", length(cols));
-        if (length(values)>length(cols))
-          warning("Supplied %d columns to be assigned a list (length %d) of values (%d unused)", length(cols), length(values), length(values)-length(cols));
-        else if (length(cols)%length(values) != 0)
-          warning("Supplied %d columns to be assigned a list (length %d) of values (recycled leaving remainder of %d items).",length(cols),length(values),length(cols)%length(values));
-      } // else it's a list() column being assigned to one column
-    }
+  if (TYPEOF(values)==VECSXP) {
+    if (length(cols)>1) {
+      if (length(values)==0) error("Supplied %d columns to be assigned an empty list (which may be an empty data.table or data.frame since they are lists too). To delete multiple columns use NULL instead. To add multiple empty list columns, use list(list()).", length(cols));
+      if (length(values)>1 && length(values)!=length(cols))
+        error("Supplied %d columns to be assigned %d items. Please see NEWS for v1.12.2.", length(cols), length(values));
+    } // else it's a list() column being assigned to one column
   }
   // Check all inputs :
   for (i=0; i<length(cols); i++) {
@@ -426,7 +405,7 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
       } else if (coln+1 > oldncol && TYPEOF(thisvalue)!=VECSXP) {  // list() is ok for new columns
         newcolnum = coln-length(names);
         if (newcolnum<0 || newcolnum>=length(newcolnames))
-          error("Internal logical error. length(newcolnames)=%d, length(names)=%d, coln=%d", length(newcolnames), length(names), coln);
+          error("Internal error in assign.c: length(newcolnames)=%d, length(names)=%d, coln=%d", length(newcolnames), length(names), coln); // # nocov
         if (isNull(thisvalue)) {
           warning("Adding new column '%s' then assigning NULL (deleting it).",CHAR(STRING_ELT(newcolnames,newcolnum)));
           continue;
@@ -455,13 +434,13 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
     if (oldtncol>oldncol+10000L) warning("truelength (%d) is greater than 10,000 items over-allocated (length = %d). See ?truelength. If you didn't set the datatable.alloccol option very large, please report to data.table issue tracker including the result of sessionInfo().",oldtncol, oldncol);
 
     if (oldtncol < oldncol+LENGTH(newcolnames))
-      error("Internal logical error. DT passed to assign has not been allocated enough column slots. l=%d, tl=%d, adding %d", oldncol, oldtncol, LENGTH(newcolnames));
+      error("Internal error: DT passed to assign has not been allocated enough column slots. l=%d, tl=%d, adding %d", oldncol, oldtncol, LENGTH(newcolnames));  // # nocov
     if (!selfrefnamesok(dt,verbose))
-      error("It appears that at some earlier point, names of this data.table have been reassigned. Please ensure to use setnames() rather than names<- or colnames<-. Otherwise, please report to data.table issue tracker.");
+      error("It appears that at some earlier point, names of this data.table have been reassigned. Please ensure to use setnames() rather than names<- or colnames<-. Otherwise, please report to data.table issue tracker.");  // # nocov
       // Can growVector at this point easily enough, but it shouldn't happen in first place so leave it as
       // strong error message for now.
     else if (TRUELENGTH(names) != oldtncol)
-      error("selfrefnames is ok but tl names [%d] != tl [%d]", TRUELENGTH(names), oldtncol);
+      error("Internal error: selfrefnames is ok but tl names [%d] != tl [%d]", TRUELENGTH(names), oldtncol);  // # nocov
     SETLENGTH(dt, oldncol+LENGTH(newcolnames));
     SETLENGTH(names, oldncol+LENGTH(newcolnames));
     for (i=0; i<LENGTH(newcolnames); i++)
@@ -486,13 +465,8 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
          (TYPEOF(values)!=VECSXP && i>0) // assigning the same values to a second column. Have to ensure a copy #2540
          ) {
         if (verbose) {
-          if (length(values)==length(cols)) {
-            // usual branch
-            Rprintf("RHS for item %d has been duplicated because NAMED is %d, but then is being plonked.\n", i+1, NAMED(thisvalue));
-          } else {
-            // rare branch where the lhs of := is longer than the items on the rhs of :=
-            Rprintf("RHS for item %d has been duplicated because the list of RHS values (length %d) is being recycled, but then is being plonked.\n", i+1, length(values));
-          }
+          Rprintf("RHS for item %d has been duplicated because NAMED is %d, but then is being plonked. length(values)==%d; length(cols)==%d\n",
+                  i+1, NAMED(thisvalue), length(values), length(cols));
         }
         thisvalue = duplicate(thisvalue);   // PROTECT not needed as assigned as element to protected list below.
       } else {
@@ -577,7 +551,7 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
         } else {
           // value is either integer or numeric vector
           if (TYPEOF(thisvalue)!=INTSXP && TYPEOF(thisvalue)!=LGLSXP && !isReal(thisvalue))
-            error("Internal logical error. Up front checks (before starting to modify DT) didn't catch type of RHS ('%s') assigning to factor column '%s'. please report to data.table issue tracker.", type2char(TYPEOF(thisvalue)), CHAR(STRING_ELT(names,coln)));
+            error("Internal error: up front checks (before starting to modify DT) didn't catch type of RHS ('%s') assigning to factor column '%s'. please report to data.table issue tracker.", type2char(TYPEOF(thisvalue)), CHAR(STRING_ELT(names,coln))); // # nocov
           if (isReal(thisvalue) || TYPEOF(thisvalue)==LGLSXP) {
             PROTECT(RHS = coerceVector(thisvalue,INTSXP));
             protecti++;
@@ -612,13 +586,13 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
             char *s1 = (char *)type2char(TYPEOF(targetcol));
             char *s2 = (char *)type2char(TYPEOF(thisvalue));
             // FR #2551, added test for equality between RHS and thisvalue to not provide the warning when length(thisvalue) == 1
-            if ( length(thisvalue)==1 && TYPEOF(RHS)!=VECSXP && TYPEOF(thisvalue)!=VECSXP && (
+            if ( length(thisvalue)==1 && TYPEOF(RHS)!=VECSXP && (
                  ( isReal(thisvalue) && isInteger(targetcol) && REAL(thisvalue)[0]==INTEGER(RHS)[0] ) ||   // DT[,intCol:=4] rather than DT[,intCol:=4L]
                  ( isLogical(thisvalue) && LOGICAL(thisvalue)[0] == NA_LOGICAL ) ||                        // DT[,intCol:=NA]
-                 ( isReal(targetcol) && isInteger(thisvalue) ) )) {
+                 ( isInteger(thisvalue) && isReal(targetcol) ) )) {
               if (verbose) Rprintf("Coerced length-1 RHS from %s to %s to match column's type.%s If this assign is happening a lot inside a loop, in particular via set(), then it may be worth avoiding this coercion by using R's type postfix on the value being assigned; e.g. typeof(0) vs typeof(0L), and typeof(NA) vs typeof(NA_integer_) vs typeof(NA_real_).\n", s2, s1,
-                                    isInteger(targetcol) && isReal(thisvalue) ? "No precision was lost. " : "");
-              // TO DO: datatable.pedantic could turn this into warning
+                                    isInteger(targetcol) && isReal(thisvalue) ? " No precision was lost." : "");
+              // TO DO: datatable.pedantic could turn this into warning. Or we could catch and avoid the coerceVector allocation ourselves using a single int.
             } else {
               if (isReal(thisvalue) && isInteger(targetcol)) {
                 int w = INTEGER(isReallyReal(thisvalue))[0];  // first fraction present (1-based), 0 if none
@@ -646,7 +620,7 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
   if (length(key)) {
     // if assigning to at least one key column, the key is truncated to one position before the first changed column.
     //any() and subsetVector() don't seem to be exposed by R API at C level, so this is done here long hand.
-    PROTECT(tmp = chmatch(key, assignedNames, 0, TRUE));
+    PROTECT(tmp = chin(key, assignedNames));
     protecti++;
     newKeyLength = xlength(key);
     for (i=0;i<LENGTH(tmp);i++) if (LOGICAL(tmp)[i]) {
@@ -710,7 +684,7 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
         tc2 = CHAR(STRING_ELT(assignedNames, i));
         char *s5 = (char*) malloc(strlen(tc2) + 5); //4 * '_' + \0
         if(s5 == NULL){
-          free(s4);
+          free(s4);                                                  // # nocov
           error("Internal error: Couldn't allocate memory for s5."); // # nocov
         }
         memset(s5, '_', 2);
@@ -736,7 +710,7 @@ SEXP assign(SEXP dt, SEXP rows, SEXP cols, SEXP newcolnames, SEXP values, SEXP v
         }
       } else if(newKeyLength < strlen(c1)){
         if(indexLength == 0 && // shortened index can be kept since it is just information on the order (see #2372)
-           LOGICAL(chmatch(mkString(s4), indexNames, 0, TRUE))[0] == 0 ){// index with shortened name not present yet
+           LOGICAL(chin(mkString(s4), indexNames))[0] == 0) {// index with shortened name not present yet
           SET_TAG(s, install(s4));
           SET_STRING_ELT(indexNames, indexNo, mkChar(s4));
           if (verbose) {
@@ -812,19 +786,20 @@ static bool anyNamed(SEXP x) {
   return false;
 }
 
-void memrecycle(SEXP target, SEXP where, int start, int len, SEXP source)
+static char memrecycle_message[1000];
+
+const char *memrecycle(SEXP target, SEXP where, int start, int len, SEXP source)
 // like memcpy but recycles single-item source
 // 'where' a 1-based INTEGER vector subset of target to assign to, or NULL or integer()
 // assigns to target[start:start+len-1] or target[where[start:start+len-1]] where start is 0-based
 {
-  if (len<1) return;
-  if (TYPEOF(target) != TYPEOF(source)) error("Internal error: TYPEOF(target)['%s']!=TYPEOF(source)['%s']", type2char(TYPEOF(target)),type2char(TYPEOF(source))); // # nocov
+  if (len<1) return NULL;
   int slen = length(source);
-  if (slen==0) return;
+  if (slen==0) return NULL;
   if (slen>1 && slen!=len) error("Internal error: recycle length error not caught earlier. slen=%d len=%d", slen, len); // # nocov
   // Internal error because the column has already been added to the DT, so length mismatch should have been caught before adding the column.
   // for 5647 this used to limit slen to len, but no longer
-
+  *memrecycle_message = '\0';
   int protecti=0;
   if (isNewList(source)) {
     // A list() column; i.e. target is a column of pointers to SEXPs rather than the much more common case
@@ -844,28 +819,91 @@ void memrecycle(SEXP target, SEXP where, int start, int len, SEXP source)
       protecti++;
     }
   }
-  if (!length(where)) {
+  if (!length(where)) {  // e.g. called from rbindlist with where=R_NilValue
     switch (TYPEOF(target)) {
-    case LGLSXP: case INTSXP :
+    case RAWSXP:
+      if (TYPEOF(source)!=RAWSXP) { source = PROTECT(coerceVector(source, RAWSXP)); protecti++; }
       if (slen==1) {
         // recycle single items
-        int *td = INTEGER(target);
+        Rbyte *td = RAW(target)+start;
+        const Rbyte val = RAW(source)[0];
+        for (int i=0; i<len; i++) td[i] = val;  // no R API inside loop as RAW()/INTEGER() etc have overhead even when inline functions
+      } else {
+        memcpy(RAW(target)+start, RAW(source), slen*SIZEOF(target));
+      }
+      break;
+    case LGLSXP: case INTSXP :
+      if (TYPEOF(source)!=LGLSXP && TYPEOF(source)!=INTSXP) { source = PROTECT(coerceVector(source, TYPEOF(target))); protecti++; }
+      if (slen==1) {
+        int *td = INTEGER(target)+start;
         const int val = INTEGER(source)[0];
-        for (int i=0; i<len; i++) td[start+i] = val;  // no R API inside loop as INTEGER has overhead (even when it's an inline function)
+        for (int i=0; i<len; i++) td[i] = val;
       } else {
         memcpy(INTEGER(target)+start, INTEGER(source), slen*SIZEOF(target));
       }
       break;
-    case REALSXP :
+    case REALSXP : {
+      bool si64 = INHERITS(source, char_integer64);
+      bool ti64 = INHERITS(target, char_integer64);
+      if (si64 && TYPEOF(source)!=REALSXP)
+        error("Internal error: source has integer64 attribute but is type '%s' not REALSXP", type2char(TYPEOF(source))); // # nocov
+      if (si64 == ti64) {
+        if (TYPEOF(source)!=REALSXP) { source = PROTECT(coerceVector(source, REALSXP)); protecti++; }
+        if (slen==1) {
+          double *td = REAL(target)+start;
+          const double val = REAL(source)[0];
+          for (int i=0; i<len; i++) td[i] = val;
+        } else {
+          memcpy(REAL(target)+start, REAL(source), slen*SIZEOF(target));
+        }
+      } else if (si64) {
+        error("Internal error: memrecycle source is integer64 but target is real and not integer64; target should be type integer64");  // # nocov
+        /*
+        double *td = REAL(target)+start;
+        if (slen==1) {
+          const double val = (double)(((int64_t *)REAL(source))[0]);
+          for (int i=0; i<len; i++) td[i] = val;
+        } else {
+          const int64_t *val = (int64_t *)REAL(source);
+          for (int i=0; i<len; i++) td[i] = (double)(val[i]);
+        }*/
+      } else {
+        int64_t *td = (int64_t *)REAL(target)+start;
+        const int mask = slen==1 ? 0 : INT_MAX;
+        switch (TYPEOF(source)) {
+        case RAWSXP: {
+          const Rbyte *sd = RAW(source);  // sd = source data
+          for (int i=0; i<len; ++i) td[i] = (int64_t)(sd[i&mask]);  // raw has no NA
+        } break;
+        case LGLSXP : case INTSXP : {
+          const int *sd = INTEGER(source);
+          for (int i=0; i<len; ++i) td[i] = sd[i&mask]==NA_INTEGER ? INT64_MIN : (int64_t)(sd[i&mask]);  // i&mask recycles a length-1 source
+        } break;
+        case REALSXP : {
+          int firstReal=0;
+          if ((firstReal=INTEGER(isReallyReal(source))[0])) {
+            sprintf(memrecycle_message, "coerced to integer64 but contains a non-integer value (%f at position %d); precision lost.", REAL(source)[firstReal-1], firstReal);
+          }
+          const double *sd = REAL(source);
+          for (int i=0; i<len; ++i) td[i] = R_FINITE(sd[i&mask]) ? (int64_t)(sd[i&mask]) : INT64_MIN;  // INT64_MIN is integer64's NA; i&mask recycles a length-1 source
+        } break;
+        default :
+          error("Internal error: memrecycle integer64 column source is type '%s'", type2char(TYPEOF(source)));  // # nocov
+        }
+      }
+    } break;
+    case CPLXSXP :
+      if (TYPEOF(source)!=CPLXSXP) { source = PROTECT(coerceVector(source, CPLXSXP)); protecti++; }
       if (slen==1) {
-        double *td = REAL(target);
-        const double val = REAL(source)[0];
-        for (int i=0; i<len; i++) td[start+i] = val;
+        Rcomplex *td = COMPLEX(target)+start;
+        const Rcomplex val = COMPLEX(source)[0];
+        for (int i=0; i<len; ++i) td[i] = val;
       } else {
-        memcpy(REAL(target)+start, REAL(source), slen*SIZEOF(target));
+        memcpy(COMPLEX(target)+start, COMPLEX(source), slen*SIZEOF(target));
       }
       break;
     case STRSXP :
+      if (TYPEOF(source)!=STRSXP) { source = PROTECT(coerceVector(source, STRSXP)); protecti++; }
       if (slen==1) {
         const SEXP val = STRING_ELT(source, 0);
         for (int i=0; i<len; i++) SET_STRING_ELT(target, start+i, val);
@@ -875,6 +913,7 @@ void memrecycle(SEXP target, SEXP where, int start, int len, SEXP source)
       }
       break;
     case VECSXP :
+      if (TYPEOF(source)!=VECSXP) { source = PROTECT(coerceVector(source, VECSXP)); protecti++; }
       if (slen==1) {
         const SEXP val = VECTOR_ELT(source, 0);
         for (int i=0; i<len; i++) SET_VECTOR_ELT(target, start+i, val);
@@ -884,9 +923,11 @@ void memrecycle(SEXP target, SEXP where, int start, int len, SEXP source)
       }
       break;
     default :
-      error("Unsupported type '%s'", type2char(TYPEOF(target)));
+      error("Unsupported type in assign.c:memrecycle '%s' (no where)", type2char(TYPEOF(target)));  // # nocov
     }
   } else {
+    if (TYPEOF(target) != TYPEOF(source))
+      error("Internal error: TYPEOF(target)['%s']!=TYPEOF(source)['%s'] in memrecycle (where)", type2char(TYPEOF(target)),type2char(TYPEOF(source))); // # nocov
     const int *wd = INTEGER(where)+start;
     const int mask = slen==1 ? 0 : INT_MAX;
     switch (TYPEOF(target)) {
@@ -925,75 +966,102 @@ void memrecycle(SEXP target, SEXP where, int start, int len, SEXP source)
       }
     } break;
     default :
-      error("Unsupported type '%s'", type2char(TYPEOF(target)));
+      error("Unsupported type in assign.c:memrecycle '%s' (where)", type2char(TYPEOF(target)));  // # nocov
     }
   }
   UNPROTECT(protecti);
+  return memrecycle_message[0] ? memrecycle_message : NULL;
 }
 
-SEXP allocNAVector(SEXPTYPE type, R_len_t n)
+void writeNA(SEXP v, const int from, const int n)
+// this is for use after allocVector() which does not initialize its result. It does write NA as you'd
+// think, other than for RAWSXP which has no NA (0 is written) and VECSXP which allocVector() already initializes with NULL.
 {
-  // an allocVector following with initialization to NA since a subassign to a new column using :=
-  // routinely leaves untouched items (rather than 0 or "" as allocVector does with its memset)
-  // We guess that author of allocVector would have liked to initialize with NA but was prevented since memset
-  // is restricted to one byte.
-  SEXP v = PROTECT(allocVector(type, n));
-  switch(type) {
+  const int to = from-1+n;  // together with <=to below with writing NA to position 2147483647 in mind
+  switch(TYPEOF(v)) {
+  case RAWSXP:
+    memset(RAW(v)+from, 0, n*SIZEOF(v));
+    break;
   case LGLSXP : {
     Rboolean *vd = (Rboolean *)LOGICAL(v);
-    for (int i=0; i<n; i++) vd[i] = NA_LOGICAL;
+    for (int i=from; i<=to; ++i) vd[i] = NA_LOGICAL;
   } break;
   case INTSXP : {
+    // same whether factor or not
     int *vd = INTEGER(v);
-    for (int i=0; i<n; i++) vd[i] = NA_INTEGER;
+    for (int i=from; i<=to; ++i) vd[i] = NA_INTEGER;
   } break;
   case REALSXP : {
-    double *vd = REAL(v);
-    for (int i=0; i<n; i++) vd[i] = NA_REAL;
+    if (INHERITS(v, char_integer64)) {
+      int64_t *vd = (int64_t *)REAL(v);
+      for (int i=from; i<=to; ++i) vd[i] = INT64_MIN;
+    } else {
+      double *vd = REAL(v);
+      for (int i=from; i<=to; ++i) vd[i] = NA_REAL;
+    }
   } break;
   case STRSXP :
     // character columns are initialized with blank string (""). So replace the all-"" with all-NA_character_
     // Since "" and NA_character_ are global constants in R, it should be ok to not use SET_STRING_ELT here. But use it anyway for safety (revisit if proved slow)
-    for (int i=0; i<n; i++) SET_STRING_ELT(v, i, NA_STRING);
+    // If there's ever a way added to R API to pass NA_STRING to allocVector() to tell it to initialize with NA not "", would be great
+    for (int i=from; i<=to; ++i) SET_STRING_ELT(v, i, NA_STRING);
     break;
   case VECSXP :
     // list columns already have each item initialized to NULL
     break;
   default :
-    error("Unsupported type '%s'", type2char(type));
+    error("Internal error: writeNA passed a vector of type '%s'", type2char(TYPEOF(v)));  // # nocov
   }
+}
+
+SEXP allocNAVector(SEXPTYPE type, R_len_t n)
+{
+  // an allocVector followed by initialization to NA since a subassign to a new column using :=
+  // routinely leaves untouched items (rather than 0 or "" as allocVector does with its memset)
+  // We guess that author of allocVector would have liked to initialize with NA but was prevented since memset
+  // is restricted to one byte.
+  SEXP v = PROTECT(allocVector(type, n));
+  writeNA(v, 0, n);
   UNPROTECT(1);
   return(v);
 }
 
+static SEXP *saveds=NULL;
+static R_len_t *savedtl=NULL, nalloc=0, nsaved=0;
+
 void savetl_init() {
-  if (nsaved || nalloc || saveds || savedtl) error("Internal error: savetl_init checks failed (%d %d %p %p). please report to data.table issue tracker.", nsaved, nalloc, saveds, savedtl); // # nocov
+  if (nsaved || nalloc || saveds || savedtl) {
+    error("Internal error: savetl_init checks failed (%d %d %p %p). Please report to data.table issue tracker.", nsaved, nalloc, saveds, savedtl); // # nocov
+  }
   nsaved = 0;
   nalloc = 100;
   saveds = (SEXP *)malloc(nalloc * sizeof(SEXP));
-  if (saveds == NULL) error("Couldn't allocate saveds in savetl_init");
   savedtl = (R_len_t *)malloc(nalloc * sizeof(R_len_t));
-  if (savedtl == NULL) {
-    free(saveds);
-    error("Couldn't allocate saveds in savetl_init");
+  if (saveds==NULL || savedtl==NULL) {
+    savetl_end();                                                        // # nocov
+    error("Failed to allocate initial %d items in savetl_init", nalloc); // # nocov
   }
 }
 
 void savetl(SEXP s)
 {
-  if (nsaved>=nalloc) {
-    nalloc *= 2;
-    char *tmp;
-    tmp = (char *)realloc(saveds, nalloc * sizeof(SEXP));
-    if (tmp == NULL) {
-      savetl_end();
-      error("Couldn't realloc saveds in savetl");
+  if (nsaved==nalloc) {
+    if (nalloc==INT_MAX) {
+      savetl_end();                                                                                                     // # nocov
+      error("Internal error: reached maximum %d items for savetl. Please report to data.table issue tracker.", nalloc); // # nocov
+    }
+    nalloc = nalloc>(INT_MAX/2) ? INT_MAX : nalloc*2;
+    char *tmp = (char *)realloc(saveds, nalloc*sizeof(SEXP));
+    if (tmp==NULL) {
+      // C spec states that if realloc() fails the original block is left untouched; it is not freed or moved. We rely on that here.
+      savetl_end();                                                      // # nocov  free(saveds) happens inside savetl_end
+      error("Failed to realloc saveds to %d items in savetl", nalloc);   // # nocov
     }
     saveds = (SEXP *)tmp;
-    tmp = (char *)realloc(savedtl, nalloc * sizeof(R_len_t));
-    if (tmp == NULL) {
-      savetl_end();
-      error("Couldn't realloc savedtl in savetl");
+    tmp = (char *)realloc(savedtl, nalloc*sizeof(R_len_t));
+    if (tmp==NULL) {
+      savetl_end();                                                      // # nocov
+      error("Failed to realloc savedtl to %d items in savetl", nalloc);  // # nocov
     }
     savedtl = (R_len_t *)tmp;
   }
@@ -1006,11 +1074,11 @@ void savetl_end() {
   // Can get called if nothing has been saved yet (nsaved==0), or even if _init() hasn't been called yet (pointers NULL). Such
   // as to clear up before error. Also, it might be that nothing needed to be saved anyway.
   for (int i=0; i<nsaved; i++) SET_TRUELENGTH(saveds[i],savedtl[i]);
-  free(saveds);  // does nothing on NULL input
-  free(savedtl);
-  nsaved = nalloc = 0;
+  free(saveds);  // possible free(NULL) which is safe no-op
   saveds = NULL;
+  free(savedtl);
   savedtl = NULL;
+  nsaved = nalloc = 0;
 }
 
 SEXP setcharvec(SEXP x, SEXP which, SEXP newx)
diff --git a/src/chmatch.c b/src/chmatch.c
index 127578ba15..a3e141f4e6 100644
--- a/src/chmatch.c
+++ b/src/chmatch.c
@@ -3,7 +3,7 @@
 #define ENC_KNOWN(x) (LEVELS(x) & 76)
 // LATIN1_MASK (1<<2) | UTF8_MASK (1<<3) | ASCII_MASK (1<<6)
 
-SEXP match_logical(SEXP table, SEXP x) {
+static SEXP match_logical(SEXP table, SEXP x) {
   R_len_t i;
   SEXP ans, m;
   ans = PROTECT(allocVector(LGLSXP, length(x)));
@@ -14,15 +14,18 @@ SEXP match_logical(SEXP table, SEXP x) {
   return(ans);
 }
 
-SEXP chmatch(SEXP x, SEXP table, R_len_t nomatch, Rboolean in) {
-  R_len_t i, m;
-  SEXP ans, s;
+static SEXP chmatchMain(SEXP x, SEXP table, int nomatch, bool chin, bool chmatchdup) {
   if (!isString(x) && !isNull(x)) error("x is type '%s' (must be 'character' or NULL)", type2char(TYPEOF(x)));
   if (!isString(table) && !isNull(table)) error("table is type '%s' (must be 'character' or NULL)", type2char(TYPEOF(table)));
-  PROTECT(ans = allocVector(in ? LGLSXP : INTSXP,length(x))); // if fails, it fails before savetl
+  if (chin && chmatchdup) error("Internal error: either chin or chmatchdup should be true not both");  // # nocov
+  // allocations up front before savetl starts
+  SEXP ans = PROTECT(allocVector(chin?LGLSXP:INTSXP, length(x)));
+  int *ansd = INTEGER(ans);
   savetl_init();
-  for (i=0; i<length(x); i++) {
-    s = STRING_ELT(x,i);
+  const SEXP *xd = STRING_PTR(x);
+  const int xlen = length(x);
+  for (int i=0; i<xlen; i++) {
+    SEXP s = xd[i];
     if (s != NA_STRING && ENC_KNOWN(s) != 64) { // PREV: s != NA_STRING && !ENC_KNOWN(s) - changed to fix for bug #5159. The previous fix
                        // dealt with UNKNOWN encodings. But we could have the same string, where both are in different
                        // encodings than ASCII (ex: UTF8 and Latin1). To fix this, we'll to resort to 'match' if not ASCII.
@@ -37,45 +40,114 @@ SEXP chmatch(SEXP x, SEXP table, R_len_t nomatch, Rboolean in) {
       // since match() considers the same string in different encodings as equal (but slower). See #2538 and #4818.
       savetl_end();
       UNPROTECT(1);
-      return (in ? match_logical(table, x) : match(table, x, nomatch));
+      return (chin ? match_logical(table, x) : match(table, x, nomatch));
     }
     if (TRUELENGTH(s)>0) savetl(s);
     // as from v1.8.0 we assume R's internal hash is positive. So in R < 2.14.0 we
     // don't save the uninitialised truelengths that by chance are negative, but
     // will save if positive. Hence R >= 2.14.0 may be faster and preferred now that R
     // initializes truelength to 0 from R 2.14.0.
-    SET_TRUELENGTH(s,0);
+    SET_TRUELENGTH(s,0);   // TODO: do we need to set to zero first (we can rely on R 3.1.0 now)?
   }
-  for (i=length(table)-1; i>=0; i--) {
-    s = STRING_ELT(table,i);
+  const int tablelen = length(table);
+  const SEXP *td = STRING_PTR(table);
+  int nuniq=0;
+  for (int i=0; i<tablelen; ++i) {
+    SEXP s = td[i];
     if (s != NA_STRING && ENC_KNOWN(s) != 64) { // changed !ENC_KNOWN(s) to !ASCII(s) - check above for explanation
-      for (int j=i+1; j<LENGTH(table); j++) SET_TRUELENGTH(STRING_ELT(table,j),0);  // reinstate 0 rather than leave the -i-1
+      // This branch is now covered by tests 2004.*. However, is this branch redundant? It means there were no non-ascii encodings
+      // in x since the fallback above to match() didn't happen when x was checked. If that was the case in x, then it means none of the
+      // x values can match to table anyway. Can't we just drop this branch then? (TODO)
+      for (int j=0; j<i; ++j) SET_TRUELENGTH(td[j],0);  // reinstate 0 rather than leave the -i-1
       savetl_end();
       UNPROTECT(1);
-      return (in ? match_logical(table, x) : match(table, x, nomatch));
+      return (chin ? match_logical(table, x) : match(table, x, nomatch));
     }
-    if (TRUELENGTH(s)>0) savetl(s);
-    SET_TRUELENGTH(s, -i-1);
+    int tl = TRUELENGTH(s);
+    if (tl>0) { savetl(s); tl=0; }
+    if (tl==0) SET_TRUELENGTH(s, chmatchdup ? -(++nuniq) : -i-1); // first time seen this string in table
   }
-  if (in) {
-    for (i=0; i<length(x); i++) {
-      LOGICAL(ans)[i] = TRUELENGTH(STRING_ELT(x,i))<0;
-      // nomatch ignored for logical as base does I think
+  if (chmatchdup && nuniq<tablelen) {
+    // chmatchdup() is basically base::pmatch() but without the partial matching part. For example :
+    //   chmatchdup(c("a", "a"), c("a", "a"))   # 1,2  - the second 'a' in 'x' has a 2nd match in 'table'
+    //   chmatchdup(c("a", "a"), c("a", "b"))   # 1,NA - the second one doesn't 'see' the first 'a'
+    //   chmatchdup(c("a", "a"), c("a", "a.1")) # 1,NA - differs from 'pmatch' output = 1,2
+    // used to be called chmatch2 before v1.12.2 and was in rbindlist.c. New implementation from 1.12.2 here in chmatch.c
+    // if nuniq==tablelen then there are no dups and the simpler chmatch branch below happens instead to avoid these allocations etc
+    // see end of file for benchmark
+    //                                                                                        uniq         dups
+    // For example: A,B,C,B,D,E,A,A   =>   A(TL=1),B(2),C(3),D(4),E(5)   =>   dupMap    1  2  3  5  6 | 8  7  4
+    //                                                                        dupLink   7  8          |    6     (blank=0)
+    int *counts = (int *)calloc(nuniq, sizeof(int));
+    int *map =    (int *)calloc(tablelen+nuniq, sizeof(int));  // +nuniq to store a 0 at the end of each group
+    if (!counts || !map) {
+      // # nocov start
+      for (int i=0; i<tablelen; i++) SET_TRUELENGTH(td[i], 0);
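+      savetl_end(); free(counts); free(map);  // release whichever calloc succeeded; free(NULL) is a safe no-op
+      error("Failed to allocate %lld bytes working memory in chmatchdup: length(table)=%d length(unique(table))=%d", ((long long)tablelen+2*(long long)nuniq)*(long long)sizeof(int), tablelen, nuniq);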
+      // # nocov end
+    }
+    for (int i=0; i<tablelen; ++i) counts[-TRUELENGTH(td[i])-1]++;
+    for (int i=0, sum=0; i<nuniq; ++i) { int tt=counts[i]; counts[i]=sum; sum+=tt+1; }
+    for (int i=0; i<tablelen; ++i) map[counts[-TRUELENGTH(td[i])-1]++] = i+1;           // 0 is left ending each group thanks to the calloc
+    for (int i=0, last=0; i<nuniq; ++i) {int tt=counts[i]+1; counts[i]=last; last=tt;}  // rewind counts to the beginning of each group
+    for (int i=0; i<xlen; ++i) {
+      int u = TRUELENGTH(xd[i]);
+      if (u<0) {
+        int w = counts[-u-1]++;
+        if (map[w]) { ansd[i]=map[w]; continue; }
+        SET_TRUELENGTH(xd[i],0); // w falls on ending 0 marker: dups used up; any more dups should return nomatch
+        // we still need the 0-setting loop at the end of this function because often there will be some values in table that are not matched to at all.
+      }
+      ansd[i] = nomatch;
+    }
+    free(counts);
+    free(map);
+  } else if (chin) {
+    for (int i=0; i<xlen; i++) {
+      ansd[i] = TRUELENGTH(xd[i])<0;
     }
   } else {
-    for (i=0; i<length(x); i++) {
-      m = TRUELENGTH(STRING_ELT(x,i));
-      INTEGER(ans)[i] = (m<0) ? -m : nomatch;
+    for (int i=0; i<xlen; i++) {
+      int m = TRUELENGTH(xd[i]);
+      ansd[i] = (m<0) ? -m : nomatch;
     }
   }
-  for (i=0; i<length(table); i++)
-    SET_TRUELENGTH(STRING_ELT(table,i),0);  // reinstate 0 rather than leave the -i-1
+  for (int i=0; i<tablelen; i++)
+    SET_TRUELENGTH(td[i], 0);  // reinstate 0 rather than leave the -i-1
   savetl_end();
   UNPROTECT(1);
   return(ans);
 }
 
-SEXP chmatchwrapper(SEXP x, SEXP table, SEXP nomatch, SEXP in) {
-  return(chmatch(x,table,INTEGER(nomatch)[0],LOGICAL(in)[0]));
+// for internal use from C :
+SEXP chmatch(SEXP x, SEXP table, int nomatch) {  // chin=  chmatchdup=
+  return chmatchMain(x, table, nomatch,             false, false);
+}
+SEXP chin(SEXP x, SEXP table) {
+  return chmatchMain(x, table, 0,                   true,  false);
+}
+
+// for use from internals at R level; chmatch and chin are exported too but not chmatchdup yet
+SEXP chmatch_R(SEXP x, SEXP table, SEXP nomatch) {
+  return chmatchMain(x, table, INTEGER(nomatch)[0], false, false);
+}
+SEXP chin_R(SEXP x, SEXP table) {
+  return chmatchMain(x, table, 0,                   true,  false);
 }
+SEXP chmatchdup_R(SEXP x, SEXP table, SEXP nomatch) {
+  return chmatchMain(x, table, INTEGER(nomatch)[0], false, true);
+}
+
+/*
+## Benchmark moved here in v1.12.2 from rbindlist.c
+set.seed(45L)
+x <- sample(letters, 1e6, TRUE)
+y <- sample(letters, 1e7, TRUE)
+system.time(ans0 <- base::pmatch(x,y,0L))           # over 5 minutes as of R 3.5.3 (March 2019)
+system.time(ans1 <- .Call("Cchmatch2_old", x,y,0L)) # 2.40sec  many years old
+system.time(ans2 <- .Call("Cchmatch2", x,y,0L))     # 0.17sec  as of 1.12.0 and in place for several years before that
+system.time(ans3 <- chmatchdup(x,y,0L))             # 0.09sec  from 1.12.2; but goal wasn't speed rather simplified code; e.g. rbindlist.c down from 960 to 360 lines
+identical(ans2,ans3)  # test 2000
+*/
 
diff --git a/src/data.table.h b/src/data.table.h
index 4f0c6bbc0e..a76c3ac63f 100644
--- a/src/data.table.h
+++ b/src/data.table.h
@@ -20,6 +20,7 @@ typedef R_xlen_t RLEN;
 #define IS_LATIN(x) (LEVELS(x) & 4)
 
 #define SIZEOF(x) sizes[TYPEOF(x)]
+#define TYPEORDER(x) typeorder[x]
 
 #ifdef MIN
 #undef MIN
@@ -56,7 +57,6 @@ typedef R_xlen_t RLEN;
 #endif
 
 // init.c
-void setSizes();
 SEXP char_integer64;
 SEXP char_ITime;
 SEXP char_IDate;
@@ -67,6 +67,8 @@ SEXP char_lens;
 SEXP char_indices;
 SEXP char_allLen1;
 SEXP char_allGrp1;
+SEXP char_factor;
+SEXP char_ordered;
 SEXP sym_sorted;
 SEXP sym_index;
 SEXP sym_BY;
@@ -82,10 +84,12 @@ long long NA_INT64_LL;
 SEXP keepattr(SEXP to, SEXP from);
 SEXP growVector(SEXP x, R_len_t newlen);
 size_t sizes[100];  // max appears to be FUNSXP = 99, see Rinternals.h
+size_t typeorder[100];
 SEXP SelfRefSymbol;
 
 // assign.c
 SEXP allocNAVector(SEXPTYPE type, R_len_t n);
+void writeNA(SEXP v, const int from, const int n);
 void savetl_init(), savetl(SEXP s), savetl_end();
 int checkOverAlloc(SEXP x);
 
@@ -111,7 +115,8 @@ SEXP uniqlist(SEXP l, SEXP order);
 SEXP uniqlengths(SEXP x, SEXP n);
 
 // chmatch.c
-SEXP chmatch(SEXP x, SEXP table, R_len_t nomatch, Rboolean in);
+SEXP chmatch(SEXP x, SEXP table, int nomatch);
+SEXP chin(SEXP x, SEXP table);
 
 SEXP isOrderedSubset(SEXP, SEXP);
 void setselfref(SEXP);
@@ -126,7 +131,7 @@ SEXP dt_na(SEXP x, SEXP cols);
 
 // assign.c
 SEXP alloccol(SEXP dt, R_len_t n, Rboolean verbose);
-void memrecycle(SEXP target, SEXP where, int r, int len, SEXP source);
+const char *memrecycle(SEXP target, SEXP where, int r, int len, SEXP source);
 SEXP shallowwrapper(SEXP dt, SEXP cols);
 
 SEXP dogroups(SEXP dt, SEXP dtcols, SEXP groups, SEXP grpcols, SEXP jiscols,
@@ -139,9 +144,6 @@ SEXP bmerge(SEXP iArg, SEXP xArg, SEXP icolsArg, SEXP xcolsArg, SEXP isorted,
                 SEXP xoArg, SEXP rollarg, SEXP rollendsArg, SEXP nomatchArg,
                 SEXP multArg, SEXP opArg, SEXP nqgrpArg, SEXP nqmaxgrpArg);
 
-// rbindlist.c
-SEXP combineFactorLevels(SEXP factorLevels, int *factorType, Rboolean *isRowOrdered);
-
 // quickselect
 double dquickselect(double *x, int n, int k);
 double iquickselect(int *x, int n, int k);
diff --git a/src/dogroups.c b/src/dogroups.c
index 25d7ddc2cc..e6fc531429 100644
--- a/src/dogroups.c
+++ b/src/dogroups.c
@@ -3,23 +3,6 @@
 #include <fcntl.h>
 #include <time.h>
 
-void setSizes() {
-  // called by init.c
-  int i;
-  for (i=0;i<100;i++) sizes[i]=0;
-  // only these types are currently allowed as column types :
-  sizes[INTSXP] = sizeof(int);     // integer and factor
-  sizes[LGLSXP] = sizeof(int);     // logical
-  sizes[REALSXP] = sizeof(double); // numeric
-  sizes[STRSXP] = sizeof(SEXP *);  // character
-  sizes[VECSXP] = sizeof(SEXP *);  // a column itself can be a list()
-  for (i=0;i<100;i++) {
-    if (sizes[i]>8) error("Type %d is sizeof() greater than 8 bytes on this machine. We haven't tested on any architecture greater than 64bit, yet.", i);
-    // One place we need the largest sizeof (assumed to be 8 bytes) is the working memory malloc in reorder.c
-  }
-  SelfRefSymbol = install(".internal.selfref");
-}
-
 SEXP dogroups(SEXP dt, SEXP dtcols, SEXP groups, SEXP grpcols, SEXP jiscols, SEXP xjiscols, SEXP grporder, SEXP order, SEXP starts, SEXP lens, SEXP jexp, SEXP env, SEXP lhs, SEXP newnames, SEXP on, SEXP verbose)
 {
   R_len_t i, j, k, rownum, ngrp, nrowgroups, njval=0, ngrpcols, ansloc=0, maxn, estn=-1, thisansloc, grpn, thislen, igrp, origIlen=0, origSDnrow=0;
diff --git a/src/fmelt.c b/src/fmelt.c
index f54ba17fb0..100bfed3bc 100644
--- a/src/fmelt.c
+++ b/src/fmelt.c
@@ -133,7 +133,7 @@ SEXP measurelist(SEXP measure, SEXP dtnames) {
   ans = PROTECT(allocVector(VECSXP, n)); protecti++;
   for (i=0; i<n; i++) {
     switch(TYPEOF(VECTOR_ELT(measure, i))) {
-      case STRSXP  : tmp = PROTECT(chmatch(VECTOR_ELT(measure, i), dtnames, 0, FALSE)); protecti++; break;
+      case STRSXP  : tmp = PROTECT(chmatch(VECTOR_ELT(measure, i), dtnames, 0)); protecti++; break;
       case REALSXP : tmp = PROTECT(coerceVector(VECTOR_ELT(measure, i), INTSXP)); protecti++; break;
       case INTSXP  : tmp = VECTOR_ELT(measure, i); break;
       default : error("Unknown 'measure.vars' type %s at index %d of list", type2char(TYPEOF(VECTOR_ELT(measure, i))), i+1);
@@ -185,7 +185,7 @@ SEXP checkVars(SEXP DT, SEXP id, SEXP measure, Rboolean verbose) {
     warning("To be consistent with reshape2's melt, id.vars and measure.vars are internally guessed when both are 'NULL'. All non-numeric/integer/logical type columns are considered id.vars, which in this case are columns [%s]. Consider providing at least one of 'id' or 'measure' vars in future.", CHAR(STRING_ELT(concat(dtnames, idcols), 0)));
   } else if (!isNull(id) && isNull(measure)) {
     switch(TYPEOF(id)) {
-      case STRSXP  : PROTECT(tmp = chmatch(id, dtnames, 0, FALSE)); protecti++; break;
+      case STRSXP  : PROTECT(tmp = chmatch(id, dtnames, 0)); protecti++; break;
       case REALSXP : PROTECT(tmp = coerceVector(id, INTSXP)); protecti++; break;
       case INTSXP  : tmp = id; break;
       default : error("Unknown 'id.vars' type %s, must be character or integer vector", type2char(TYPEOF(id)));
@@ -214,7 +214,7 @@ SEXP checkVars(SEXP DT, SEXP id, SEXP measure, Rboolean verbose) {
     }
   } else if (isNull(id) && !isNull(measure)) {
     switch(TYPEOF(measure)) {
-      case STRSXP  : tmp2 = PROTECT(chmatch(measure, dtnames, 0, FALSE)); protecti++; break;
+      case STRSXP  : tmp2 = PROTECT(chmatch(measure, dtnames, 0)); protecti++; break;
       case REALSXP : tmp2 = PROTECT(coerceVector(measure, INTSXP)); protecti++; break;
       case INTSXP  : tmp2 = measure; break;
       case VECSXP  : tmp2 = PROTECT(measurelist(measure, dtnames)); protecti++; break;
@@ -250,7 +250,7 @@ SEXP checkVars(SEXP DT, SEXP id, SEXP measure, Rboolean verbose) {
     }
   } else if (!isNull(id) && !isNull(measure)) {
     switch(TYPEOF(id)) {
-      case STRSXP  : tmp = PROTECT(chmatch(id, dtnames, 0, FALSE)); protecti++; break;
+      case STRSXP  : tmp = PROTECT(chmatch(id, dtnames, 0)); protecti++; break;
       case REALSXP : tmp = PROTECT(coerceVector(id, INTSXP)); protecti++; break;
       case INTSXP  : tmp = id; break;
       default : error("Unknown 'id.vars' type %s, must be character or integer vector", type2char(TYPEOF(id)));
@@ -261,7 +261,7 @@ SEXP checkVars(SEXP DT, SEXP id, SEXP measure, Rboolean verbose) {
     }
     idcols = PROTECT(tmp); protecti++;
     switch(TYPEOF(measure)) {
-      case STRSXP  : tmp2 = PROTECT(chmatch(measure, dtnames, 0, FALSE)); protecti++; break;
+      case STRSXP  : tmp2 = PROTECT(chmatch(measure, dtnames, 0)); protecti++; break;
       case REALSXP : tmp2 = PROTECT(coerceVector(measure, INTSXP)); protecti++; break;
       case INTSXP  : tmp2 = measure; break;
       case VECSXP  : tmp2 = PROTECT(measurelist(measure, dtnames)); protecti++; break;
@@ -350,6 +350,71 @@ static void preprocess(SEXP DT, SEXP id, SEXP measure, SEXP varnames, SEXP valna
   }
 }
 
+static SEXP combineFactorLevels(SEXP factorLevels, SEXP target, int * factorType, Rboolean * isRowOrdered)
+// Finds unique levels directly in one pass with no need to create hash tables. Creates integer factor
+// too in the same single pass. Previous version called factor(x, levels=unique) where x was type character
+// and needed hash table.
+// TODO keep the original factor columns as factor and use new technique in rbindlist.c. The calling
+// environments are a little different hence postponed for now (e.g. rbindlist calls writeNA which
+// a general purpose combiner would need to know how many to write)
+// factorType is 1 for factor and 2 for ordered
+// will simply unique normal factors and attempt to find global order for ordered ones
+{
+  int maxlevels=0, nitem=length(factorLevels);
+  for (int i=0; i<nitem; ++i) {
+    SEXP this = VECTOR_ELT(factorLevels, i);
+    if (!isString(this)) error("Internal error: combineFactorLevels in fmelt.c expects all-character input");  // # nocov
+    maxlevels+=length(this);
+  }
+  if (!isString(target)) error("Internal error: combineFactorLevels in fmelt.c expects a character target to factorize");  // # nocov
+  int nrow = length(target);
+  SEXP ans = PROTECT(allocVector(INTSXP, nrow));
+  SEXP *levelsRaw = (SEXP *)R_alloc(maxlevels, sizeof(SEXP));  // allocate for worst-case all-unique levels
+  int *ansd = INTEGER(ans);
+  const SEXP *targetd = STRING_PTR(target);
+  savetl_init();
+  // no alloc or any fail point until savetl_end()
+  int nlevel=0;
+  for (int i=0; i<nitem; ++i) {
+    const SEXP this = VECTOR_ELT(factorLevels, i);
+    const SEXP *thisd = STRING_PTR(this);
+    const int thisn = length(this);
+    for (int k=0; k<thisn; ++k) {
+      SEXP s = thisd[k];
+      if (s==NA_STRING) continue;  // NA shouldn't be in levels but remove it just in case
+      int tl = TRUELENGTH(s);
+      if (tl<0) continue;  // seen this level before
+      if (tl>0) savetl(s);
+      SET_TRUELENGTH(s,-(++nlevel));
+      levelsRaw[nlevel-1] = s;
+    }
+  }
+  for (int i=0; i<nrow; ++i) {
+    if (targetd[i]==NA_STRING) {
+      *ansd++ = NA_INTEGER;
+    } else {
+      int tl = TRUELENGTH(targetd[i]);
+      *ansd++ = tl<0 ? -tl : NA_INTEGER;
+    }
+  }
+  for (int i=0; i<nlevel; ++i) SET_TRUELENGTH(levelsRaw[i], 0);
+  savetl_end();
+  // now after savetl_end, we can alloc (which might fail)
+  SEXP levelsSxp;
+  setAttrib(ans, R_LevelsSymbol, levelsSxp=allocVector(STRSXP, nlevel));
+  for (int i=0; i<nlevel; ++i) SET_STRING_ELT(levelsSxp, i, levelsRaw[i]);
+  if (*factorType==2) {
+    SEXP tt;
+    setAttrib(ans, R_ClassSymbol, tt=allocVector(STRSXP, 2));
+    SET_STRING_ELT(tt, 0, char_ordered);
+    SET_STRING_ELT(tt, 1, char_factor);
+  } else {
+    setAttrib(ans, R_ClassSymbol, ScalarString(char_factor));
+  }
+  UNPROTECT(1);
+  return ans;
+}
+
 SEXP getvaluecols(SEXP DT, SEXP dtnames, Rboolean valfactor, Rboolean verbose, struct processData *data) {
   int i, j, k, counter=0, thislen=0;
   SEXP thisvaluecols, ansvals, thisidx=R_NilValue, flevels;
@@ -466,10 +531,11 @@ SEXP getvaluecols(SEXP DT, SEXP dtnames, Rboolean valfactor, Rboolean verbose, s
       UNPROTECT(thisprotecti);  // inside inner loop (note that it's double loop) so as to limit use of protection stack
     }
     if (thisvalfactor && data->isfactor[i] && TYPEOF(target) != VECSXP) {
-      SEXP clevels = PROTECT(combineFactorLevels(flevels, &(data->isfactor[i]), isordered));
-      SEXP factorLangSxp = PROTECT(lang3(install(data->isfactor[i] == 1 ? "factor" : "ordered"), target, clevels));
-      SET_VECTOR_ELT(ansvals, i, eval(factorLangSxp, R_GlobalEnv));
-      UNPROTECT(2);  // clevels, factorLangSxp
+      //SEXP clevels = PROTECT(combineFactorLevels(flevels, &(data->isfactor[i]), isordered));
+      //SEXP factorLangSxp = PROTECT(lang3(install(data->isfactor[i] == 1 ? "factor" : "ordered"), target, clevels));
+      //SET_VECTOR_ELT(ansvals, i, eval(factorLangSxp, R_GlobalEnv));
+      //UNPROTECT(2);  // clevels, factorLangSxp
+      SET_VECTOR_ELT(ansvals, i, combineFactorLevels(flevels, target, &(data->isfactor[i]), isordered));
     }
   }
   UNPROTECT(2);  // flevels, ansvals. Not using two protection counters (protecti and thisprotecti) to keep rchk happy.
@@ -477,74 +543,82 @@ SEXP getvaluecols(SEXP DT, SEXP dtnames, Rboolean valfactor, Rboolean verbose, s
 }
 
 SEXP getvarcols(SEXP DT, SEXP dtnames, Rboolean varfactor, Rboolean verbose, struct processData *data) {
-
-  int i,j,k,cnt=0,nrows=0, nlevels=0, protecti=0, thislen, zerolen=0;
-  SEXP ansvars, thisvaluecols, levels, target, matchvals, thisnames;
-
-  ansvars = PROTECT(allocVector(VECSXP, 1)); protecti++;
-  SET_VECTOR_ELT(ansvars, 0, target=allocVector(INTSXP, data->totlen) );
-  if (data->lvalues == 1) {
-    thisvaluecols = VECTOR_ELT(data->valuecols, 0);
-    // tmp fix for #1055
-    thisnames = PROTECT(allocVector(STRSXP, length(thisvaluecols))); protecti++;
-    for (i=0; i<length(thisvaluecols); i++) {
-      SET_STRING_ELT(thisnames, i, STRING_ELT(dtnames, INTEGER(thisvaluecols)[i]-1));
-    }
-    matchvals = PROTECT(match(thisnames, thisnames, 0)); protecti++;
-    if (data->narm) {
-      for (j=0; j<data->lmax; j++) {
-        thislen = length(VECTOR_ELT(data->naidx, j));
-        for (k=0; k<thislen; k++)
-          INTEGER(target)[nrows + k] = INTEGER(matchvals)[j - zerolen]; // fix for #1359
-        nrows += thislen;
-        zerolen += (thislen == 0);
+  // reworked in PR#3455 to create character/factor directly for efficiency, and handle duplicates (#1754)
+  // data->nrow * data->lmax == data->totlen
+  SEXP ansvars=PROTECT(allocVector(VECSXP, 1));
+  int protecti=1;
+  SEXP target;
+  if (data->lvalues==1 && length(VECTOR_ELT(data->valuecols, 0)) != data->lmax)
+    error("Internal error: fmelt.c:getvarcols %d %d", length(VECTOR_ELT(data->valuecols, 0)), data->lmax);  // # nocov
+  if (!varfactor) {
+    SET_VECTOR_ELT(ansvars, 0, target=allocVector(STRSXP, data->totlen));
+    if (data->lvalues == 1) {
+      const int *thisvaluecols = INTEGER(VECTOR_ELT(data->valuecols, 0));
+      for (int j=0, ansloc=0; j<data->lmax; ++j) {
+        const int thislen = data->narm ? length(VECTOR_ELT(data->naidx, j)) : data->nrow;
+        SEXP str = STRING_ELT(dtnames, thisvaluecols[j]-1);
+        for (int k=0; k<thislen; ++k) SET_STRING_ELT(target, ansloc++, str);
       }
-      nlevels = data->lmax - zerolen;
     } else {
-      for (j=0; j<data->lmax; j++) {
-        for (k=0; k<data->nrow; k++)
-          INTEGER(target)[data->nrow*j + k] = INTEGER(matchvals)[j];
+      for (int j=0, ansloc=0, level=1; j<data->lmax; ++j) {
+        const int thislen = data->narm ? length(VECTOR_ELT(data->naidx, j)) : data->nrow;
+        if (thislen==0) continue;  // so as not to bump level
+        char buff[20];
+        sprintf(buff, "%d", level++);
+        SEXP str = PROTECT(mkChar(buff));
+        for (int k=0; k<thislen; ++k) SET_STRING_ELT(target, ansloc++, str);
+        UNPROTECT(1);
       }
-      nlevels = data->lmax;
     }
   } else {
-    if (data->narm) {
-      for (j=0; j<data->lmax; j++) {
-        thislen = length(VECTOR_ELT(data->naidx, j));
-        for (k=0; k<thislen; k++)
-          INTEGER(target)[nrows + k] = j+1;
-        nrows += thislen;
-        nlevels += (thislen != 0);
+    SET_VECTOR_ELT(ansvars, 0, target=allocVector(INTSXP, data->totlen));
+    SEXP levels;
+    int *td = INTEGER(target);
+    if (data->lvalues == 1) {
+      SEXP thisvaluecols = VECTOR_ELT(data->valuecols, 0);
+      int len = length(thisvaluecols);
+      levels = PROTECT(allocVector(STRSXP, len)); protecti++;
+      const int *vd = INTEGER(thisvaluecols);
+      for (int j=0; j<len; ++j) SET_STRING_ELT(levels, j, STRING_ELT(dtnames, vd[j]-1));
+      SEXP m = PROTECT(chmatch(levels, levels, 0)); protecti++;  // do we have any dups?
+      int numRemove = 0;  // remove dups and any for which narm and all-NA
+      int *md = INTEGER(m);
+      for (int j=0; j<len; ++j) {
+        if (md[j]!=j+1 /*dup*/ || (data->narm && length(VECTOR_ELT(data->naidx, j))==0)) { numRemove++; md[j]=0; }
+      }
+      if (numRemove) {
+        SEXP newlevels = PROTECT(allocVector(STRSXP, len-numRemove)); protecti++;
+        for (int i=0, loc=0; i<len; ++i) if (md[i]!=0) { SET_STRING_ELT(newlevels, loc++, STRING_ELT(levels, i)); }
+        m = PROTECT(chmatch(levels, newlevels, 0)); protecti++;  // budge up the gaps
+        md = INTEGER(m);
+        levels = newlevels;
+      }
+      for (int j=0, ansloc=0; j<data->lmax; ++j) {
+        const int thislen = data->narm ? length(VECTOR_ELT(data->naidx, j)) : data->nrow;
+        for (int k=0; k<thislen; ++k) td[ansloc++] = md[j];
       }
     } else {
-      for (j=0; j<data->lmax; j++) {
-        for (k=0; k<data->nrow; k++)
-          INTEGER(target)[data->nrow*j + k] = j+1;
+      int nlevel=0;
+      levels = PROTECT(allocVector(STRSXP, data->lmax)); protecti++;
+      for (int j=0, ansloc=0; j<data->lmax; ++j) {
+        const int thislen = data->narm ? length(VECTOR_ELT(data->naidx, j)) : data->nrow;
+        if (thislen==0) continue;  // so as not to bump level
+        char buff[20];
+        sprintf(buff, "%d", nlevel+1);
+        SET_STRING_ELT(levels, nlevel++, mkChar(buff));  // generate levels = 1:nlevels
+        for (int k=0; k<thislen; ++k) td[ansloc++] = nlevel;
       }
-      nlevels = data->lmax;
-    }
-  }
-  SEXP tmp = PROTECT(mkString("factor")); protecti++;
-  setAttrib(target, R_ClassSymbol, tmp);
-  cnt = 0;
-  if (data->lvalues == 1) {
-    levels = PROTECT(allocVector(STRSXP, nlevels)); protecti++;
-    thisvaluecols = VECTOR_ELT(data->valuecols, 0); // levels will be column names
-    for (i=0; i<data->lmax; i++) {
-      if (data->narm) {
-        if (length(VECTOR_ELT(data->naidx, i)) == 0) continue;
+      if (nlevel < data->lmax) {
+        // data->narm is true and there are some all-NA items causing at least one 'if (thislen==0) continue' above
+        // shrink the levels
+        SEXP newlevels = PROTECT(allocVector(STRSXP, nlevel)); protecti++;
+        for (int i=0; i<nlevel; ++i) SET_STRING_ELT(newlevels, i, STRING_ELT(levels, i));
+        levels = newlevels;
       }
-      SET_STRING_ELT(levels, cnt++, STRING_ELT(dtnames, INTEGER(thisvaluecols)[i]-1));
     }
-  } else {
-    SEXP tt = PROTECT(seq_int(nlevels, 1)); protecti++;      // generate levels = 1:nlevels
-    levels = PROTECT(coerceVector(tt, STRSXP)); protecti++;  // tt PROTECTED for rchk
+    setAttrib(target, R_LevelsSymbol, levels);
+    setAttrib(target, R_ClassSymbol, ScalarString(char_factor));
   }
-  // base::unique is fast on vectors, and the levels on variable columns are usually small
-  SEXP uniqueLangSxp = PROTECT(lang2(install("unique"), levels)); protecti++;
-  tmp = PROTECT(eval(uniqueLangSxp, R_GlobalEnv)); protecti++;
-  setAttrib(target, R_LevelsSymbol, tmp);
-  if (!varfactor) SET_VECTOR_ELT(ansvars, 0, asCharacterFactor(target));
   UNPROTECT(protecti);
   return(ansvars);
 }
diff --git a/src/freadR.c b/src/freadR.c
index b8de1d895a..380f08fb5a 100644
--- a/src/freadR.c
+++ b/src/freadR.c
@@ -206,7 +206,7 @@ _Bool userOverride(int8_t *type, lenOff *colNames, const char *anchor, int ncol)
     SEXP typeRName_sxp = PROTECT(allocVector(STRSXP, NUT));
     for (int i=0; i<NUT; i++) SET_STRING_ELT(typeRName_sxp, i, mkChar(typeRName[i]));
     if (isString(colClassesSxp)) {
-      SEXP typeEnum_idx = PROTECT(chmatch(colClassesSxp, typeRName_sxp, NUT, FALSE));
+      SEXP typeEnum_idx = PROTECT(chmatch(colClassesSxp, typeRName_sxp, NUT));
       if (LENGTH(colClassesSxp)==1) {
         signed char newType = typeEnum[INTEGER(typeEnum_idx)[0]-1];
         if (newType == CT_DROP) STOP("colClasses='NULL' is not permitted; i.e. to drop all columns and load nothing");
@@ -224,7 +224,7 @@ _Bool userOverride(int8_t *type, lenOff *colNames, const char *anchor, int ncol)
     } else {
       if (!isNewList(colClassesSxp)) STOP("CfreadR: colClasses is not type list");
       if (!length(getAttrib(colClassesSxp, R_NamesSymbol))) STOP("CfreadR: colClasses is type list but has no names");
-      SEXP typeEnum_idx = PROTECT(chmatch(PROTECT(getAttrib(colClassesSxp, R_NamesSymbol)), typeRName_sxp, NUT, FALSE));
+      SEXP typeEnum_idx = PROTECT(chmatch(PROTECT(getAttrib(colClassesSxp, R_NamesSymbol)), typeRName_sxp, NUT));
       for (int i=0; i<LENGTH(colClassesSxp); i++) {
         SEXP items;
         signed char thisType = typeEnum[INTEGER(typeEnum_idx)[i]-1];
@@ -239,7 +239,7 @@ _Bool userOverride(int8_t *type, lenOff *colNames, const char *anchor, int ncol)
           continue;
         }
         SEXP itemsInt;
-        if (isString(items)) itemsInt = PROTECT(chmatch(items, colNamesSxp, NA_INTEGER, FALSE));
+        if (isString(items)) itemsInt = PROTECT(chmatch(items, colNamesSxp, NA_INTEGER));
         else                 itemsInt = PROTECT(coerceVector(items, INTSXP));
         // UNPROTECTed directly just after this for loop. No protecti++ here is correct.
         for (int j=0; j<LENGTH(items); j++) {
@@ -258,7 +258,7 @@ _Bool userOverride(int8_t *type, lenOff *colNames, const char *anchor, int ncol)
         UNPROTECT(1); // UNPROTECTing itemsInt inside loop to save protection stack
       }
       for (int i=0; i<ncol; i++) if (type[i]<0) type[i] *= -1;  // undo sign; was used to detect duplicates
-      UNPROTECT(2);  // typeEnum_idx (+1 for its protect of getAttrib)
+      UNPROTECT(2);  // typeEnum_idx (+1 for its protect of getAttrib, which rchk asked for iirc)
     }
     UNPROTECT(1);  // typeRName_sxp
   }
@@ -267,7 +267,7 @@ _Bool userOverride(int8_t *type, lenOff *colNames, const char *anchor, int ncol)
   }
   if (length(dropSxp)) {
     SEXP itemsInt;
-    if (isString(dropSxp)) itemsInt = PROTECT(chmatch(dropSxp, colNamesSxp, NA_INTEGER, FALSE));
+    if (isString(dropSxp)) itemsInt = PROTECT(chmatch(dropSxp, colNamesSxp, NA_INTEGER));
     else                   itemsInt = PROTECT(coerceVector(dropSxp, INTSXP));
     for (int j=0; j<LENGTH(itemsInt); j++) {
       int k = INTEGER(itemsInt)[j];
@@ -294,7 +294,7 @@ _Bool userOverride(int8_t *type, lenOff *colNames, const char *anchor, int ncol)
     SEXP tt;
     if (isString(selectSxp)) {
       // invalid cols check part of #1445 moved here (makes sense before reading the file)
-      tt = PROTECT(chmatch(selectSxp, colNamesSxp, NA_INTEGER, FALSE));
+      tt = PROTECT(chmatch(selectSxp, colNamesSxp, NA_INTEGER));
       for (int i=0; i<length(selectSxp); i++) if (INTEGER(tt)[i]==NA_INTEGER)
         DTWARN("Column name '%s' not found in column name header (case sensitive), skipping.", CHAR(STRING_ELT(selectSxp, i)));
     } else {
diff --git a/src/init.c b/src/init.c
index ca23a2f9e7..5e8d168006 100644
--- a/src/init.c
+++ b/src/init.c
@@ -15,13 +15,14 @@ SEXP selfrefokwrapper();
 SEXP truelength();
 SEXP setcharvec();
 SEXP setcolorder();
-SEXP chmatchwrapper();
+SEXP chmatch_R();
+SEXP chmatchdup_R();
+SEXP chin_R();
 SEXP freadR();
 SEXP fwriteR();
 SEXP reorder();
 SEXP rbindlist();
 SEXP vecseq();
-SEXP copyattr();
 SEXP setlistelt();
 SEXP setmutable();
 SEXP address();
@@ -44,7 +45,6 @@ SEXP pointWrapper();
 SEXP setNumericRounding();
 SEXP getNumericRounding();
 SEXP binary();
-SEXP chmatch2();
 SEXP subsetDT();
 SEXP subsetVector();
 SEXP convertNegAndZeroIdx();
@@ -97,13 +97,14 @@ R_CallMethodDef callMethods[] = {
 {"Ctruelength", (DL_FUNC) &truelength, -1},
 {"Csetcharvec", (DL_FUNC) &setcharvec, -1},
 {"Csetcolorder", (DL_FUNC) &setcolorder, -1},
-{"Cchmatchwrapper", (DL_FUNC) &chmatchwrapper, -1},
+{"Cchmatch", (DL_FUNC) &chmatch_R, -1},
+{"Cchmatchdup", (DL_FUNC) &chmatchdup_R, -1},
+{"Cchin", (DL_FUNC) &chin_R, -1},
 {"CfreadR", (DL_FUNC) &freadR, -1},
 {"CfwriteR", (DL_FUNC) &fwriteR, -1},
 {"Creorder", (DL_FUNC) &reorder, -1},
 {"Crbindlist", (DL_FUNC) &rbindlist, -1},
 {"Cvecseq", (DL_FUNC) &vecseq, -1},
-{"Ccopyattr", (DL_FUNC) &copyattr, -1},
 {"Csetlistelt", (DL_FUNC) &setlistelt, -1},
 {"Csetmutable", (DL_FUNC) &setmutable, -1},
 {"Caddress", (DL_FUNC) &address, -1},
@@ -126,7 +127,6 @@ R_CallMethodDef callMethods[] = {
 {"CsetNumericRounding", (DL_FUNC) &setNumericRounding, -1},
 {"CgetNumericRounding", (DL_FUNC) &getNumericRounding, -1},
 {"Cbinary", (DL_FUNC) &binary, -1},
-{"Cchmatch2", (DL_FUNC) &chmatch2, -1},
 {"CsubsetDT", (DL_FUNC) &subsetDT, -1},
 {"CsubsetVector", (DL_FUNC) &subsetVector, -1},
 {"CconvertNegAndZeroIdx", (DL_FUNC) &convertNegAndZeroIdx, -1},
@@ -172,6 +172,20 @@ R_ExternalMethodDef externalMethods[] = {
 {NULL, NULL, 0}
 };
 
+static void setSizes() {
+  for (int i=0; i<100; ++i) { sizes[i]=0; typeorder[i]=0; }
+  // only these types are currently allowed as column types :
+  sizes[LGLSXP] =  sizeof(int);       typeorder[LGLSXP] =  0;
+  sizes[RAWSXP] =  sizeof(Rbyte);     typeorder[RAWSXP] =  1;
+  sizes[INTSXP] =  sizeof(int);       typeorder[INTSXP] =  2;   // integer and factor
+  sizes[REALSXP] = sizeof(double);    typeorder[REALSXP] = 3;   // numeric and integer64
+  sizes[CPLXSXP] = sizeof(Rcomplex);  typeorder[CPLXSXP] = 4;
+  sizes[STRSXP] =  sizeof(SEXP *);    typeorder[STRSXP] =  5;
+  sizes[VECSXP] =  sizeof(SEXP *);    typeorder[VECSXP] =  6;   // list column
+  if (sizeof(char *)>8) error("Pointers are %d bytes, greater than 8. We have not tested on any architecture greater than 64bit yet.", (int)sizeof(char *));
+  // One place we need the largest sizeof is the working memory malloc in reorder.c
+}
+
 void attribute_visible R_init_datatable(DllInfo *info)
 // relies on pkg/src/Makevars to mv data.table.so to datatable.so
 {
@@ -249,11 +263,12 @@ void attribute_visible R_init_datatable(DllInfo *info)
   char_indices =   PRINTNAME(install("indices"));
   char_allLen1 =   PRINTNAME(install("allLen1"));
   char_allGrp1 =   PRINTNAME(install("allGrp1"));
+  char_factor =    PRINTNAME(install("factor"));
+  char_ordered =   PRINTNAME(install("ordered"));
 
   if (TYPEOF(char_integer64) != CHARSXP) {
     // checking one is enough in case of any R-devel changes
-    error("PRINTNAME(install(\"integer64\")) has returned %s not %s",
-      type2char(TYPEOF(char_integer64)), type2char(CHARSXP));
+    error("PRINTNAME(install(\"integer64\")) has returned %s not %s", type2char(TYPEOF(char_integer64)), type2char(CHARSXP));  // # nocov
   }
 
   // create commonly used symbols, same as R_*Symbol but internal to DT
@@ -267,6 +282,7 @@ void attribute_visible R_init_datatable(DllInfo *info)
   sym_index   = install("index");
   sym_BY      = install(".BY");
   sym_maxgrpn = install("maxgrpn");
+  SelfRefSymbol = install(".internal.selfref");
 
   initDTthreads();
   avoid_openmp_hang_within_fork();
@@ -312,9 +328,9 @@ inline double LLtoD(long long x) {
   return u.d;
 }
 
-
+// # nocov start
 SEXP hasOpenMP() {
-  // Just for use by onAttach to avoid an RPRINTF from C level which isn't suppressable by CRAN
+  // Just for use by onAttach (hence nocov) to avoid an RPRINTF from C level which isn't suppressible by CRAN
   // There is now a 'grep' in CRAN_Release.cmd to detect any use of RPRINTF in init.c, which is
   // why RPRINTF is capitalized in this comment to avoid that grep.
   // TODO: perhaps .Platform or .Machine in R itself could contain whether OpenMP is available.
@@ -324,6 +340,7 @@ SEXP hasOpenMP() {
   return ScalarLogical(FALSE);
   #endif
 }
+// # nocov end
 
 SEXP dllVersion() {
   // .onLoad calls this and checks the same as packageVersion() to ensure no R/C version mismatch, #3056
diff --git a/src/rbindlist.c b/src/rbindlist.c
index bec3fdd4e5..042c9c99bb 100644
--- a/src/rbindlist.c
+++ b/src/rbindlist.c
@@ -1,958 +1,468 @@
 #include "data.table.h"
 #include <Rdefines.h>
-#include <stdint.h>
-// #include <signal.h> // the debugging machinery + breakpoint aidee
-// raise(SIGINT);
 
-/* Eddi's hash setup for combining factor levels appropriately - untouched from previous state (except made combineFactorLevels static) */
-
-// a simple linked list, will use this when finding global order for ordered factors
-// will keep two ints
-struct llist {
-  struct llist * next;
-  R_len_t i, j;
-};
-
-// hash table code copied from main/unique.c, specialized for our particular needs
-// as our table will just be strings
-// took out long vector ifdefs as that relied on too much base code
-// can revisit this later if there is need for more than ~1e9 length factor columns
-// UTF8 and Cache bools are not set correctly for now
-
-typedef size_t hlen;
-
-/* Hash function and equality test for keys */
-typedef struct _HashData HashData;
-
-struct _HashData {
-  int K;
-  hlen M;
-  RLEN nmax;
-  hlen (*hash)(SEXP, RLEN, HashData *);
-  int (*equal)(SEXP, RLEN, SEXP, RLEN);
-  struct llist ** HashTable;
-
-  int nomatch;
-  Rboolean useUTF8;
-  Rboolean useCache;
-};
-
-/*
-Integer keys are hashed via a random number generator
-based on Knuth's recommendations. The high order K bits
-are used as the hash code.
-
-NB: lots of this code relies on M being a power of two and
-on silent integer overflow mod 2^32.
-
-<FIXME> Integer keys are wasteful for logical and raw vectors, but
-the tables are small in that case. It would be much easier to
-implement long vectors, though.
-*/
-
-/* Currently the hash table is implemented as a (signed) integer
-array. So there are two 31-bit restrictions, the length of the
-array and the values. The values are initially NIL (-1). O-based
-indices are inserted by isDuplicated, and invalidated by setting
-to NA_INTEGER.
-*/
-
-static hlen scatter(unsigned int key, HashData *d)
-{
-  return 3141592653U * key >> (32 - d->K);
-}
-
-/* Hash CHARSXP by address. Hash values are int, For 64bit pointers,
- * we do (upper ^ lower) */
-static hlen cshash(SEXP x, RLEN indx, HashData *d)
-{
-  intptr_t z = (intptr_t) STRING_ELT(x, indx);
-  unsigned int z1 = (unsigned int)(z & 0xffffffff), z2 = 0;
-#if SIZEOF_LONG == 8
-  z2 = (unsigned int)(z/0x100000000L);
-#endif
-  return scatter(z1 ^ z2, d);
-}
-
-static hlen shash(SEXP x, RLEN indx, HashData *d)
-{
-  unsigned int k;
-  const char *p;
-  const void *vmax = vmaxget();
-  if(!d->useUTF8 && d->useCache) return cshash(x, indx, d);
-  /* Not having d->useCache really should not happen anymore. */
-  p = translateCharUTF8(STRING_ELT(x, indx));
-  k = 0;
-  while (*p++)
-    k = 11 * k + (unsigned int) *p; /* was 8 but 11 isn't a power of 2 */
-  vmaxset(vmax); /* discard any memory used by translateChar */
-  return scatter(k, d);
-}
-
-static int sequal(SEXP x, RLEN i, SEXP y, RLEN j)
-{
-  // using our function instead of copying a lot more code from base
-  return !StrCmp(STRING_ELT(x, i), STRING_ELT(y, j));
-}
-
-/*
-Choose M to be the smallest power of 2
-not less than 2*n and set K = log2(M).
-Need K >= 1 and hence M >= 2, and 2^M < 2^31-1, hence n <= 2^29.
-
-Dec 2004: modified from 4*n to 2*n, since in the worst case we have
-a 50% full table, and that is still rather efficient -- see
-R. Sedgewick (1998) Algorithms in C++ 3rd edition p.606.
-*/
-static void MKsetup(HashData *d, RLEN n)
+SEXP rbindlist(SEXP l, SEXP usenamesArg, SEXP fillArg, SEXP idcolArg)
 {
-  if(n < 0 || n >= 1073741824) /* protect against overflow to -ve */
-    error("length %d is too large for hashing", n);
-
-  size_t n2 = 2U * (size_t) n;
-  d->M = 2;
-  d->K = 1;
-  while (d->M < n2) {
-    d->M *= 2;
-    d->K++;
+  if (!isLogical(fillArg) || LENGTH(fillArg) != 1 || LOGICAL(fillArg)[0] == NA_LOGICAL)
+    error("fill= should be TRUE or FALSE");
+  if (!isLogical(usenamesArg) || LENGTH(usenamesArg)!=1)
+    error("use.names= should be TRUE, FALSE, or not used (\"check\" by default)");  // R level converts "check" to NA
+  if (!length(l)) return(l);
+  if (TYPEOF(l) != VECSXP) error("Input to rbindlist must be a list. This list can contain data.tables, data.frames or plain lists.");
+  Rboolean usenames = LOGICAL(usenamesArg)[0];
+  const bool fill = LOGICAL(fillArg)[0];
+  if (fill && usenames!=TRUE) {
+    if (usenames==FALSE) warning("use.names= cannot be FALSE when fill is TRUE. Setting use.names=TRUE."); // else no warning if usenames==NA (default)
+    usenames=TRUE;
   }
-  d->nmax = n;
-}
-
-#define IMAX 4294967296L
-static void HashTableSetup(HashData *d, RLEN n)
-{
-  d->hash = shash;
-  d->equal = sequal;
-  MKsetup(d, n);
-  //d->HashTable = malloc(sizeof(struct llist *) * (d->M));
-  //if (d->HashTable == NULL) error("malloc failed in rbindlist.c. This part of the code will be reworked.");
-  d->HashTable = (struct llist **)R_alloc(d->M, sizeof(struct llist *));
-  for (RLEN i = 0; i < d->M; i++) d->HashTable[i] = NULL;
-}
-/*
-static void CleanHashTable(HashData *d)
-{
-  struct llist * root, * tmp;
-
-  for (RLEN i = 0; i < d->M; ++i) {
-    root = d->HashTable[i];
-    while (root != NULL) {
-      tmp = root->next;
-      free(root);
-      root = tmp;
+  const bool idcol = !isNull(idcolArg);
+  if (idcol && (!isString(idcolArg) || LENGTH(idcolArg)!=1)) error("Internal error: rbindlist.c idcol is not a single string");  // # nocov
+  int ncol=0, first=0;
+  int64_t nrow=0, upperBoundUniqueNames=1;
+  bool anyNames=false;
+  int numZero=0, firstZeroCol=0, firstZeroItem=0;
+  int *eachMax = (int *)R_alloc(LENGTH(l), sizeof(int));
+  // pre-check for any errors here to save having to get cleanup right below when usenames
+  for (int i=0; i<LENGTH(l); i++) {  // length(l)>0 checked above
+    eachMax[i] = 0;
+    SEXP li = VECTOR_ELT(l, i);
+    if (isNull(li)) continue;
+    if (TYPEOF(li) != VECSXP) error("Item %d of input is not a data.frame, data.table or list", i+1);
+    const int thisncol = length(li);
+    if (!thisncol) continue;
+    // delete as now more flexible ... if (fill && isNull(getAttrib(li, R_NamesSymbol))) error("When fill=TRUE every item of the input must have column names. Item %d does not.", i+1);
+    if (fill) {
+      if (thisncol>ncol) ncol=thisncol;  // this section initializes ncol with max ncol. ncol may be increased when usenames is accounted for further down
+    } else {
+      if (ncol==0) { ncol=thisncol; first=i; }
+      else if (thisncol!=ncol) error("Item %d has %d columns, inconsistent with item %d which has %d columns. To fill missing columns use fill=TRUE.", i+1, thisncol, first+1, ncol);
+    }
+    int nNames = length(getAttrib(li, R_NamesSymbol));
+    if (nNames>0 && nNames!=thisncol) error("Item %d has %d columns but %d column names. Invalid object.", i+1, thisncol, nNames);
+    if (nNames>0) anyNames=true;
+    upperBoundUniqueNames += nNames;
+    int maxLen=0, whichMax=0;
+    for (int j=0; j<thisncol; ++j) { int tt=length(VECTOR_ELT(li,j)); if (tt>maxLen) { maxLen=tt; whichMax=j; } }
+    for (int j=0; j<thisncol; ++j) {
+      int tt = length(VECTOR_ELT(li, j));
+      if (tt>1 && tt!=maxLen) error("Column %d of item %d is length %d inconsistent with column %d which is length %d. Only length-1 columns are recycled.", j+1, i+1, tt, whichMax+1, maxLen);
+      if (tt==0 && maxLen>0 && numZero++==0) { firstZeroCol = j; firstZeroItem=i; }
     }
+    eachMax[i] = maxLen;
+    nrow += maxLen;
   }
-  free(d->HashTable);
-}
-*/
-
-// factorType is 1 for factor and 2 for ordered
-// will simply unique normal factors and attempt to find global order for ordered ones
-SEXP combineFactorLevels(SEXP factorLevels, int * factorType, Rboolean * isRowOrdered) {
-  // find total length
-  RLEN size = 0;
-  R_len_t len = LENGTH(factorLevels), n, i, j;
-  for (i = 0; i < len; ++i) {
-    SEXP elem = VECTOR_ELT(factorLevels, i);
-    n = LENGTH(elem);
-    size += n;
-    /* for (j = 0; j < n; ++j) { */
-    /*     if(IS_BYTES(STRING_ELT(elem, j))) { */
-    /*         data.useUTF8 = FALSE; break; */
-    /*     } */
-    /*     if(ENC_KNOWN(STRING_ELT(elem, j))) { */
-    /*         data.useUTF8 = TRUE; */
-    /*     } */
-    /*     if(!IS_CACHED(STRING_ELT(elem, j))) { */
-    /*         data.useCache = FALSE; break; */
-    /*     } */
-    /* } */
+  if (numZero) {  // #1871
+    SEXP names = getAttrib(VECTOR_ELT(l, firstZeroItem), R_NamesSymbol);
+    const char *ch = names==R_NilValue ? "" : CHAR(STRING_ELT(names, firstZeroCol));
+    warning("Column %d ['%s'] of item %d is length 0. This (and %d other%s like it) has been filled with NA (NULL for list columns) to make each item uniform.",
+            firstZeroCol+1, ch, firstZeroItem+1, numZero-1, numZero==2?"":"s");
   }
-
-  // set up hash to put duplicates in
-  HashData data;
-  data.useUTF8 = FALSE;
-  data.useCache = TRUE;
-  HashTableSetup(&data, size);
-
-  struct llist **h = data.HashTable;
-  hlen idx;
-  struct llist * pl;
-  R_len_t uniqlen = 0;
-  // we insert in opposite order because it's more convenient later to choose first of the duplicates
-  for (i = len-1; i >= 0; --i) {
-    SEXP elem = VECTOR_ELT(factorLevels, i);
-    n = LENGTH(elem);
-    for (j = n-1; j >= 0; --j) {
-      idx = data.hash(elem, j, &data);
-      while (h[idx] != NULL) {
-        pl = h[idx];
-        if (data.equal(VECTOR_ELT(factorLevels, pl->i), pl->j, elem, j))
-          break;
-        // it's a collision, not a match, so iterate to a new spot
-        idx = (idx + 1) % data.M;
-      }
-      if (data.nmax-- < 0) error("hash table is full");
-
-      pl = (struct llist *)R_alloc(1, sizeof(struct llist));
-      pl->next = NULL;
-      pl->i = i;
-      pl->j = j;
-      if (h[idx] != NULL) {
-        pl->next = h[idx];
-      } else {
-        ++uniqlen;
+  if (nrow==0 && ncol==0) return(R_NilValue);
+  if (nrow>INT32_MAX) error("Total rows in the list is %lld which is larger than the maximum number of rows, currently %d", (long long)nrow, INT32_MAX);
+  if (usenames==TRUE && !anyNames) error("use.names=TRUE but no item of input list has any names");
+
+  int *colMap=NULL; // maps each column in final result to the column of each list item
+  if (usenames==TRUE || usenames==NA_LOGICAL) {
+    // here we proceed as if fill=true for brevity (accounting for dups is tricky) and then catch any missings after this branch
+    // when use.names==NA we also proceed here as if use.names was TRUE to save new code and then check afterwards the map is 1:ncol for every item
+    // first find number of unique column names present; i.e. length(unique(unlist(lapply(l,names))))
+    SEXP *uniq = (SEXP *)malloc(upperBoundUniqueNames * sizeof(SEXP));  // upperBoundUniqueNames was initialized with 1 to ensure this is defined (otherwise 0 when no item has names)
+    if (!uniq) error("Failed to allocate upper bound of %lld unique column names [sum(lapply(l,ncol))]", (long long)upperBoundUniqueNames);
+    savetl_init();
+    int nuniq=0;
+    for (int i=0; i<LENGTH(l); i++) {
+      SEXP li = VECTOR_ELT(l, i);
+      int thisncol=length(li);  // length() rather than LENGTH() because li may be NULL here
+      if (isNull(li) || !LENGTH(li)) continue;
+      const SEXP cn = getAttrib(li, R_NamesSymbol);
+      if (!length(cn)) continue;
+      const SEXP *cnp = STRING_PTR(cn);
+      for (int j=0; j<thisncol; j++) {
+        SEXP s = cnp[j];
+        if (TRUELENGTH(s)<0) continue;  // seen this name before
+        if (TRUELENGTH(s)>0) savetl(s);
+        uniq[nuniq++] = s;
+        SET_TRUELENGTH(s,-nuniq);
       }
-      h[idx] = pl;
     }
-  }
-
-  SEXP finalLevels = PROTECT(allocVector(STRSXP, uniqlen)); // UNPROTECTed at the end of this function
-  R_len_t counter = 0;
-  if (*factorType == 2) {
-    int *locs = (int *)R_alloc(len, sizeof(int));
-    for (int i=0; i<len; i++) locs[i] = 0;
-    // note there's a goto (!!) normalFactor below. When locs was allocated with malloc, the goto jumped over the
-    // old free() and caused leak. Now uses the safer R_alloc.  TODO - review all this logic.
-
-    R_len_t k;
-    SEXP tmp;
-    for (i = 0; i < len; ++i) {
-      if (!isRowOrdered[i]) continue;
-      SEXP elem = VECTOR_ELT(factorLevels, i);
-      n = LENGTH(elem);
-      for (j = locs[i]; j < n; ++j) {
-        idx = data.hash(elem, j, &data);
-        while (h[idx] != NULL) {
-          pl = h[idx];
-          if (data.equal(VECTOR_ELT(factorLevels, pl->i), pl->j, elem, j)) {
-            do {
-              if (!isRowOrdered[pl->i]) continue;
-
-              tmp = VECTOR_ELT(factorLevels, pl->i);
-              if (locs[pl->i] > pl->j) {
-                // failed to construct global order, need to break out of too many loops
-                // so will use goto :o
-                warning("ordered factor levels cannot be combined, going to convert to simple factor instead");
-                counter = 0;
-                *factorType = 1;
-                goto normalFactor;
-              }
-
-              for (k = locs[pl->i]; k < pl->j; ++k) {
-                SET_STRING_ELT(finalLevels, counter++, STRING_ELT(tmp, k));
-              }
-              locs[pl->i] = pl->j + 1;
-            } while ( (pl = pl->next) ); // added parenthesis to remove compiler warning 'suggest parentheses around assignment used as truth value'
-            SET_STRING_ELT(finalLevels, counter++, STRING_ELT(elem, j));
-            break;
-          }
-          // it's a collision, not a match, so iterate to a new spot
-          idx = (idx + 1) % data.M;
-        }
-        if (h[idx] == NULL) error("internal hash error, please report to data.table issue tracker");
-      }
+    if (nuniq>0) {
+      SEXP *tt = realloc(uniq, nuniq*sizeof(SEXP));  // shrink to only what we need to release the spare
+      // shrink should never fail; if it ever does, realloc leaves the original block valid so keep using it
+      if (tt) uniq = tt;
     }
-
-    // fill in the rest of the unordered elements
-    Rboolean record;
-    for (i = 0; i < len; ++i) {
-      if (isRowOrdered[i]) continue;
-      SEXP elem = VECTOR_ELT(factorLevels, i);
-      n = LENGTH(elem);
-      for (j = 0; j < n; ++j) {
-        idx = data.hash(elem, j, &data);
-        while (h[idx] != NULL) {
-          pl = h[idx];
-          if (data.equal(VECTOR_ELT(factorLevels, pl->i), pl->j, elem, j)) {
-            // Fixes #899. "rest" can have identical levels in
-            // more than 1 data.table.
-            if (!(pl->i == i && pl->j == j)) break;
-            record = TRUE;
-            do {
-              // if this element was in an ordered list, it's been recorded already
-              if (isRowOrdered[pl->i]) {
-                record = FALSE;
-                break;
-              }
-            } while ( (pl = pl->next) ); // added parenthesis to remove compiler warning 'suggest parentheses around assignment used as truth value'
-            if (record)
-              SET_STRING_ELT(finalLevels, counter++, STRING_ELT(elem, j));
-
-            break;
-          }
-          // it's a collision, not a match, so iterate to a new spot
-          idx = (idx + 1) % data.M;
-        }
-        if (h[idx] == NULL) error("internal hash error, please report to data.table issue tracker");
+    // now count the dups (if any) and how they're distributed across the items
+    int *counts = (int *)calloc(nuniq, sizeof(int)); // count of each unique name within the current item's colnames
+    int *maxdup = (int *)calloc(nuniq, sizeof(int)); // the highest count of each unique name within any one item's colnames
+    if (!counts || !maxdup) {
+      // # nocov start
+      for (int i=0; i<nuniq; ++i) SET_TRUELENGTH(uniq[i], 0);
+      free(uniq); free(counts); free(maxdup);
+      savetl_end();
+      error("Failed to allocate nuniq=%d items working memory in rbindlist.c", nuniq);
+      // # nocov end
+    }
+    for (int i=0; i<LENGTH(l); i++) {
+      SEXP li = VECTOR_ELT(l, i);
+      int thisncol=length(li);
+      if (thisncol==0) continue;
+      const SEXP cn = getAttrib(li, R_NamesSymbol);
+      if (!length(cn)) continue;
+      const SEXP *cnp = STRING_PTR(cn);
+      memset(counts, 0, nuniq*sizeof(int));
+      for (int j=0; j<thisncol; j++) {
+        SEXP s = cnp[j];
+        counts[ -TRUELENGTH(s)-1 ]++;
+      }
+      for (int u=0; u<nuniq; u++) {
+        if (counts[u] > maxdup[u]) maxdup[u] = counts[u];
       }
     }
-  }
-
- normalFactor:
-  if (*factorType == 1) {
-    for (i = 0; i < len; ++i) {
-      SEXP elem = VECTOR_ELT(factorLevels, i);
-      n = LENGTH(elem);
-      for (j = 0; j < n; ++j) {
-        idx = data.hash(elem, j, &data);
-        while (h[idx] != NULL) {
-          pl = h[idx];
-          if (data.equal(VECTOR_ELT(factorLevels, pl->i), pl->j, elem, j)) {
-            if (pl->i == i && pl->j == j) {
-              SET_STRING_ELT(finalLevels, counter++, STRING_ELT(elem, j));
+    int ttncol = 0;
+    for (int u=0; u<nuniq; ++u) ttncol+=maxdup[u];
+    if (ttncol>ncol) ncol=ttncol;
+    free(maxdup); maxdup=NULL;  // not needed again
+    // ncol is now the final number of columns accounting for unique and dups across all colnames
+    // allocate a matrix:  nrows==length(list)  each entry contains which column to fetch for that final column
+
+    int *colMapRaw = (int *)malloc(LENGTH(l)*ncol * sizeof(int));  // the result of this scope used later
+    int *uniqMap = (int *)malloc(ncol * sizeof(int)); // maps the ith unique string to the first time it occurs in the final result
+    int *dupLink = (int *)malloc(ncol * sizeof(int)); // if a colname has occurred before (a dup) links from the 1st to the 2nd time in the final result, 2nd to 3rd, etc
+    if (!colMapRaw || !uniqMap || !dupLink) {
+      // # nocov start
+      for (int i=0; i<nuniq; ++i) SET_TRUELENGTH(uniq[i], 0);
+      free(uniq); free(counts); free(colMapRaw); free(uniqMap); free(dupLink);
+      savetl_end();
+      error("Failed to allocate ncol=%d items working memory in rbindlist.c", ncol);
+      // # nocov end
+    }
+    for (int i=0; i<LENGTH(l)*ncol; ++i) colMapRaw[i]=-1;   // 0-based so use -1
+    for (int i=0; i<ncol; ++i) {uniqMap[i] = dupLink[i] = -1;}
+    int nextCol=0, lastDup=ncol-1;
+
+    for (int i=0; i<LENGTH(l); ++i) {
+      SEXP li = VECTOR_ELT(l, i);
+      int thisncol=length(li);
+      if (thisncol==0) continue;
+      const SEXP cn = getAttrib(li, R_NamesSymbol);
+      if (!length(cn)) {
+        for (int j=0; j<thisncol; j++) colMapRaw[i*ncol + j] = j;
+      } else {
+        const SEXP *cnp = STRING_PTR(cn);
+        memset(counts, 0, nuniq*sizeof(int));
+        for (int j=0; j<thisncol; j++) {
+          SEXP s = cnp[j];
+          int w = -TRUELENGTH(s)-1;
+          int wi = counts[w]++; // how many dups have we seen before of this name within this item
+          if (uniqMap[w]==-1) {
+            // first time seen this name across all items
+            uniqMap[w] = nextCol++;
+          } else {
+            while (wi && dupLink[w]>0) { w=dupLink[w]; --wi; }  // hop through the dups
+            if (wi && dupLink[w]==-1) {
+              // first time we've seen this number of dups of this name
+              w = dupLink[w] = lastDup--;
+              uniqMap[w] = nextCol++;
             }
-            break;
           }
-          // it's a collision, not a match, so iterate to a new spot
-          idx = (idx + 1) % data.M;
+          colMapRaw[i*ncol + uniqMap[w]] = j;
         }
-        if (h[idx] == NULL) error("internal hash error, please report to data.table issue tracker");
       }
     }
+    for (int i=0; i<nuniq; ++i) SET_TRUELENGTH(uniq[i], 0);  // zero out our usage of tl
+    free(uniq); free(counts); free(uniqMap); free(dupLink);  // all local scope so no need to set to NULL
+    savetl_end();  // restore R's usage
+
+    // colMapRaw is still allocated. It was allocated with malloc because we needed to catch if the alloc failed.
+    // move it to R's heap so it gets automatically free'd on exit, and on any error between now and the end of rbindlist.
+    colMap = (int *)R_alloc(LENGTH(l)*ncol, sizeof(int));
+    // This R_alloc could fail with out-of-memory but given it is very small it's very unlikely. If it does fail, colMapRaw will leak.
+    //   But colMapRaw leaking now in this very rare situation is better than colMapRaw leaking in the more likely but still rare conditions later.
+    //   And it's better than having to trap all exit points from here to the end of rbindlist, which may not be possible; e.g. writeNA() could error inside it with an unsupported type.
+    //   This very unlikely leak could be fixed by using an on.exit() at R level rbindlist(); R-exts$6.1.2 refers to pwilcox for example. However, that would not
+    //   solve the (mere) leak if we ever call rbindlist internally from other C functions.
+    memcpy(colMap, colMapRaw, LENGTH(l)*ncol*sizeof(int));
+    free(colMapRaw);  // local scope in this branch to ensure can't be used below
+
+    // to view map when debugging ...
+    // for (int i=0; i<LENGTH(l); ++i) { for (int j=0; j<ncol; ++j) Rprintf("%2d ",colMap[i*ncol + j]);  Rprintf("\n"); }
   }
 
-  // CleanHashTable(&data);   No longer needed now we use R_alloc(). But the hash table approach
-  // will be removed completely at some point.
-  UNPROTECT(1); // finalLevels
-  return finalLevels;
-}
-
-
-/* Arun's addition and changes to incorporate 'usenames=T/F' and 'fill=T/F' arguments to rbindlist */
-
-/*
-  l               = input list of data.tables/lists/data.frames
-  n               = length(l)
-  ans             = rbind'd result
-  i               = an index over length of l
-  use.names       = whether binding should check for names and bind them accordingly
-  fill            = whether missing columns should be filled with NAs
-
-  ans_ptr         = final_names - column names for 'ans' (list item 1)
-  ans_ptr         = match_indices - when use.names=TRUE, for each element in l, what's the destination col index (in 'ans') for each of the cols (list item 2)
-  n_rows          = total number of rows in 'ans'
-  n_cols          = total number of cols in 'ans'
-  max_type        = for each col in 'ans', what's the final SEXPTYPE? (for coercion if necessary)
-  is_factor       = for each col in 'ans' mark which one's a factor (to convert to factor at the end)
-  is_ofactor      = for each col in 'ans' mark which one's an ordered factor (to convert to ordered factor at the end)
-  fn_rows         = the length of first column (rows) for each item in l.
-  mincol          = get the minimum number of columns in an item from l. Used to check if 'fill=TRUE' is really necessary, even if set.
-*/
-
-struct preprocessData {
-  SEXP ans_ptr, colname;
-  size_t n_rows, n_cols;
-  int *fn_rows, *is_factor, first, lcount, mincol, protecti;
-  SEXPTYPE *max_type;
-};
-
-static SEXP unlist2(SEXP v) {
-
-  RLEN i, j, k=0, ni, n=0;
-  SEXP ans, vi, lnames, groups, runids;
-
-  for (i=0; i<length(v); i++) n += length(VECTOR_ELT(v, i));
-  ans    = PROTECT(allocVector(VECSXP, 3));
-  lnames = PROTECT(allocVector(STRSXP, n));
-  groups = PROTECT(allocVector(INTSXP, n));
-  runids = PROTECT(allocVector(INTSXP, n));
-  for (i=0; i<length(v); i++) {
-    vi = VECTOR_ELT(v, i);
-    ni = length(vi);
-    for (j=0; j<ni; j++) {
-      SET_STRING_ELT(lnames, k + j, STRING_ELT(vi, j));
-      INTEGER(groups)[k + j] = i+1;
-      INTEGER(runids)[k + j] = j;
-    }
-    k+=j;
-  }
-  SET_VECTOR_ELT(ans, 0, lnames);
-  SET_VECTOR_ELT(ans, 1, groups);
-  SET_VECTOR_ELT(ans, 2, runids);
-  UNPROTECT(4);
-  return(ans);
-}
-
-// Don't use elsewhere. No checks are made on byArg and handleSorted
-// if handleSorted is 0, then it'll return integer(0) as such when
-// input is already sorted, like forder. if not, seq_len(nrow(dt)).
-static SEXP fast_order(SEXP dt, R_len_t byArg, R_len_t handleSorted) {
-
-  R_len_t i, protecti=0;
-  SEXP ans, by=R_NilValue, retGrp, sortStr, order, na, starts;
-
-  retGrp  = PROTECT(allocVector(LGLSXP, 1)); LOGICAL(retGrp)[0]  = TRUE;   protecti++;
-  sortStr = PROTECT(allocVector(LGLSXP, 1)); LOGICAL(sortStr)[0] = FALSE;  protecti++;
-  na      = PROTECT(allocVector(LGLSXP, 1)); LOGICAL(na)[0] = FALSE;       protecti++;
-
-  if (byArg) {
-    by    = PROTECT(allocVector(INTSXP, byArg));                           protecti++;
-    order = PROTECT(allocVector(INTSXP, byArg));                           protecti++;
-    for (i=0; i<byArg; i++) {
-      INTEGER(by)[i] = i+1;
-      INTEGER(order)[i] = 1;
+  if (fill && usenames==NA_LOGICAL) error("Internal error: usenames==NA but fill=TRUE. usenames should have been set to TRUE earlier with warning.");
+  if (!fill && (usenames==TRUE || usenames==NA_LOGICAL)) {
+    // Ensure no missings in both cases, and (when usenames==NA) all columns in same order too
+    // We proceeded earlier as if fill was true, so varying ncol items will have missings here
+    const char *warnStr = usenames==NA_LOGICAL?" use.names='check' (default from v1.12.2) generates this warning and proceeds as if use.names=FALSE for backwards compatibility; TRUE in future.":"";
+    for (int i=0; i<LENGTH(l); ++i) {
+      SEXP li = VECTOR_ELT(l, i);
+      if (!length(li) || !length(getAttrib(li, R_NamesSymbol))) continue;
+      for (int j=0; j<ncol; ++j) {
+        const int w = colMap[i*ncol + j];
+        if (w==-1) {
+          int missi = i;
+          while (i<LENGTH(l) && colMap[i*ncol + j]==-1) i++;  // test the bound first so colMap is never read past its end
+          if (i==LENGTH(l)) error("Internal error: could not find the first column name not present in earlier item");
+          SEXP s = getAttrib(VECTOR_ELT(l, i), R_NamesSymbol);
+          int w2 = colMap[i*ncol + j];
+          const char *str = isString(s) ? CHAR(STRING_ELT(s,w2)) : "";
+          (usenames==TRUE ? error : warning)(
+            "Column %d ['%s'] of item %d is missing in item %d. Use fill=TRUE to fill with NA (NULL for list columns), or use.names=FALSE to ignore column names.%s",
+            w2+1, str, i+1, missi+1, warnStr );
+          i = LENGTH(l); // break from outer i loop
+          break;         // break from inner j loop
+        }
+        if (w!=j && usenames==NA_LOGICAL) {
+          SEXP s = getAttrib(VECTOR_ELT(l, i), R_NamesSymbol);
+          if (!isString(s) || i==0) error("Internal error: usenames==NA but an out-of-order name has been found in an item with no names or the first item. [%d]", i);
+          warning("Column %d ['%s'] of item %d appears in position %d in item %d. Set use.names=TRUE to match by column name, or use.names=FALSE to ignore column names.%s",
+                  w+1, CHAR(STRING_ELT(s,w)), i+1, j+1, i, warnStr);
+          i = LENGTH(l);
+          break;
+        }
+      }
     }
-  } else {
-    order = PROTECT(allocVector(INTSXP, 1)); INTEGER(order)[0] = 1;        protecti++;
-  }
-  ans = PROTECT(forder(dt, by, retGrp, sortStr, order, na));               protecti++;
-  if (!length(ans) && handleSorted != 0) {
-    starts = PROTECT(getAttrib(ans, sym_starts));                          protecti++;
-    // if cols are already sorted, 'forder' gives integer(0), got to replace it with 1:.N
-    ans = PROTECT(allocVector(INTSXP, length(VECTOR_ELT(dt, 0))));         protecti++;
-    for (i=0; i<length(ans); i++) INTEGER(ans)[i] = i+1;
-    // TODO: for loop appears redundant because length(ans)==0 due to if (!length(ans)) above
-    setAttrib(ans, sym_starts, starts);
-  }
-  UNPROTECT(protecti);
-  return(ans);
-}
-
-static SEXP uniq_lengths(SEXP v, R_len_t n) {
-  R_len_t nv=length(v);
-  SEXP ans = PROTECT(allocVector(INTSXP, nv));
-  for (R_len_t i=1; i<nv; i++) {
-    INTEGER(ans)[i-1] = INTEGER(v)[i] - INTEGER(v)[i-1];
   }
-  if (nv>0) {
-    // last value
-    INTEGER(ans)[nv-1] = n - INTEGER(v)[nv-1] + 1;
+  if (usenames==NA_LOGICAL) {
+    usenames=FALSE;  // for backwards compatibility, see warning above which says this will change to TRUE in future
+    ncol = length(VECTOR_ELT(l, first));  // ncol was increased as if fill=true, so reduce it back given fill=false (fill==false checked above)
   }
-  UNPROTECT(1);
-  return(ans);
-}
-
-static SEXP match_names(SEXP v) {
-
-  R_len_t i, j, idx, ncols, protecti=0;
-  SEXP ans, dt, lnames, ti;
-  SEXP uorder, starts, ulens, index, firstofeachgroup, origorder;
-  SEXP fnames, findices, runid, grpid;
-
-  ans    = PROTECT(allocVector(VECSXP, 2));
-  dt     = PROTECT(unlist2(v)); protecti++;
-  lnames = VECTOR_ELT(dt, 0);
-  grpid  = PROTECT(duplicate(VECTOR_ELT(dt, 1))); protecti++; // dt[1] will be reused, so backup
-  runid  = VECTOR_ELT(dt, 2);
 
-  uorder = PROTECT(fast_order(dt, 2, 1));  protecti++; // byArg alone is set, everything else is set inside fast_order
-  starts = getAttrib(uorder, sym_starts);
-  ulens  = PROTECT(uniq_lengths(starts, length(lnames))); protecti++;
-
-  // seq_len(.N) for each group
-  index = PROTECT(VECTOR_ELT(dt, 1)); protecti++; // reuse dt[1] (in 0-index coordinate), value already backed up above.
-  for (i=0; i<length(ulens); i++) {
-    for (j=0; j<INTEGER(ulens)[i]; j++)
-      INTEGER(index)[INTEGER(uorder)[INTEGER(starts)[i]-1+j]-1] = j;
-  }
-  // order again
-  uorder = PROTECT(fast_order(dt, 2, 1));  protecti++; // byArg alone is set, everything else is set inside fast_order
-  starts = getAttrib(uorder, sym_starts);
-  ulens  = PROTECT(uniq_lengths(starts, length(lnames))); protecti++;
-  ncols  = length(starts);
-  // check if order has to be changed (bysameorder = FALSE here by default - in `[.data.table` parlance)
-  firstofeachgroup = PROTECT(allocVector(INTSXP, length(starts)));
-  for (i=0; i<ncols; i++) INTEGER(firstofeachgroup)[i] = INTEGER(uorder)[INTEGER(starts)[i]-1];
-  origorder = PROTECT(fast_order(firstofeachgroup, 0, 0));
-  if (length(origorder)) {
-    reorder(starts, origorder);
-    reorder(ulens, origorder);
-  }
-  UNPROTECT(2);
-  // get fnames and findices
-  fnames   = PROTECT(allocVector(STRSXP, ncols)); protecti++;
-  findices = PROTECT(allocVector(VECSXP, ncols)); protecti++;
-  for (i=0; i<ncols; i++) {
-    idx = INTEGER(uorder)[INTEGER(starts)[i]-1]-1;
-    SET_STRING_ELT(fnames, i, STRING_ELT(lnames, idx));
-    ti = PROTECT(allocVector(INTSXP, length(v)));
-    for (j=0;j<length(v);j++) INTEGER(ti)[j]=-1; // TODO: can we eliminate this?
-    for (j=0; j<INTEGER(ulens)[i]; j++) {
-      idx = INTEGER(uorder)[INTEGER(starts)[i]-1+j]-1;
-      INTEGER(ti)[INTEGER(grpid)[idx]-1] = INTEGER(runid)[idx];
-    }
-    UNPROTECT(1);
-    SET_VECTOR_ELT(findices, i, ti);
-  }
-  UNPROTECT(protecti);
-  SET_VECTOR_ELT(ans, 0, fnames);
-  SET_VECTOR_ELT(ans, 1, findices);
-  UNPROTECT(1); // ans
-  return(ans);
-}
-
-static void preprocess(SEXP l, Rboolean usenames, Rboolean fill, struct preprocessData *data) {
-
-  R_len_t i, j, idx;
-  SEXP li, lnames=R_NilValue, fnames, findices=R_NilValue, f_ind=R_NilValue, thiscol, col_name=R_NilValue, thisClass = R_NilValue;
-  SEXPTYPE type;
-
-  data->first = -1; data->lcount = 0; data->n_rows = 0; data->n_cols = 0; data->protecti = 0;
-  data->max_type = NULL; data->is_factor = NULL; data->ans_ptr = R_NilValue; data->mincol=0;
-  data->fn_rows = (int *)R_alloc(LENGTH(l), sizeof(int));
-  data->colname = R_NilValue;
-
-  // get first non null name, 'rbind' was doing a 'match.names' for each item.. which is a bit more time consuming.
-  // And warning that it'll be matched by names is not necessary, I think, as that's the default for 'rbind'. We
-  // should instead document it.
-  for (i=0; i<LENGTH(l); i++) { // isNull is checked already in rbindlist
-    li = VECTOR_ELT(l, i);
-    if (isNull(li)) continue;
-    if (TYPEOF(li) != VECSXP) error("Item %d of list input is not a data.frame, data.table or list",i+1);
-    if (!LENGTH(li)) continue;
-    col_name = getAttrib(li, R_NamesSymbol);
-    if (!isNull(col_name)) break;
-  }
-  if (!isNull(col_name)) { data->colname = PROTECT(col_name); data->protecti++; }
-  if (usenames) { lnames = PROTECT(allocVector(VECSXP, LENGTH(l))); data->protecti++;}
-  for (i=0; i<LENGTH(l); i++) {
-    data->fn_rows[i] = 0;  // careful to initialize before continues as R_alloc above doesn't initialize
-    li = VECTOR_ELT(l, i);
-    if (isNull(li)) continue;
-    if (TYPEOF(li) != VECSXP) error("Item %d of list input is not a data.frame, data.table or list",i+1);
-    if (!LENGTH(li)) continue;
-    col_name = getAttrib(li, R_NamesSymbol);
-    if (fill && isNull(col_name))
-      error("fill=TRUE, but names of input list at position %d is NULL. All items of input list must have names set when fill=TRUE.", i+1);
-    data->lcount++;
-    data->fn_rows[i] = length(VECTOR_ELT(li, 0));
-    if (data->first == -1) {
-      data->first = i;
-      data->n_cols = LENGTH(li);
-      data->mincol = LENGTH(li);
-      if (!usenames) {
-        data->ans_ptr = PROTECT(allocVector(VECSXP, 2)); data->protecti++;
-        if (isNull(col_name)) SET_VECTOR_ELT(data->ans_ptr, 0, data->colname);
-        else SET_VECTOR_ELT(data->ans_ptr, 0, col_name);
-      } else {
-        if (isNull(col_name)) SET_VECTOR_ELT(lnames, i, data->colname);
-        else SET_VECTOR_ELT(lnames, i, col_name);
+  SEXP ans=PROTECT(allocVector(VECSXP, idcol + ncol)), ansNames;
+  setAttrib(ans, R_NamesSymbol, ansNames=allocVector(STRSXP, idcol + ncol));
+  if (idcol) {
+    SET_STRING_ELT(ansNames, 0, STRING_ELT(idcolArg, 0));
+    SEXP idval, listNames=getAttrib(l, R_NamesSymbol);
+    if (length(listNames)) {
+      SET_VECTOR_ELT(ans, 0, idval=allocVector(STRSXP, nrow));
+      for (int i=0,ansloc=0; i<LENGTH(l); ++i) {
+        SEXP li = VECTOR_ELT(l, i);
+        if (!length(li)) continue;
+        const int thisnrow = length(VECTOR_ELT(li, 0));
+        SEXP thisname = STRING_ELT(listNames, i);
+        for (int k=0; k<thisnrow; ++k) SET_STRING_ELT(idval, ansloc++, thisname);
       }
-      data->n_rows += data->fn_rows[i];
-      continue;
     } else {
-      if (!fill && LENGTH(li) != data->n_cols)
-        if (LENGTH(li) != data->n_cols) error("Item %d has %d columns, inconsistent with item %d which has %d columns. If instead you need to fill missing columns, use set argument 'fill' to TRUE.",i+1, LENGTH(li), data->first+1, data->n_cols);
-    }
-    if (data->mincol > LENGTH(li)) data->mincol = LENGTH(li);
-    data->n_rows += data->fn_rows[i];
-    if (usenames) {
-      if (isNull(col_name)) SET_VECTOR_ELT(lnames, i, data->colname);
-      else SET_VECTOR_ELT(lnames, i, col_name);
+      SET_VECTOR_ELT(ans, 0, idval=allocVector(INTSXP, nrow));
+      int *idvald = INTEGER(idval);
+      for (int i=0,ansloc=0; i<LENGTH(l); ++i) {
+        SEXP li = VECTOR_ELT(l, i);
+        if (!length(li)) continue;
+        const int thisnrow = length(VECTOR_ELT(li, 0));
+        for (int k=0; k<thisnrow; ++k) idvald[ansloc++] = i+1;
+      }
     }
   }
-  if (usenames) {
-    data->ans_ptr = PROTECT(match_names(lnames)); data->protecti++;
-    fnames = VECTOR_ELT(data->ans_ptr, 0);
-    findices = VECTOR_ELT(data->ans_ptr, 1);
-    if (isNull(data->colname) && data->n_cols > 0)
-      error("use.names=TRUE but no item of input list has any names.\n");
-    if (!fill && length(fnames) != data->mincol) {
-      error("Answer requires %d columns whereas one or more item(s) in the input list has only %d columns. This could be because the items in the list may not all have identical column names or some of the items may have duplicate names. In either case, if you're aware of this and would like to fill those missing columns, set the argument 'fill=TRUE'.", length(fnames), data->mincol);
-    } else data->n_cols = length(fnames);
-  }
 
-  // decide type of each column
-  // initialize the max types - will possibly increment later
-  data->max_type  = (SEXPTYPE *)R_alloc(data->n_cols, sizeof(SEXPTYPE));
-  data->is_factor = (int *)R_alloc(data->n_cols, sizeof(int));
-  for (i = 0; i< data->n_cols; i++) {
-    thisClass = R_NilValue;
-    data->max_type[i] = 0;
-    data->is_factor[i] = 0;
-    if (usenames) f_ind = VECTOR_ELT(findices, i);
-    for (j=data->first; j<LENGTH(l); j++) {
-      if (data->is_factor[i] == 2) break;
-      idx = (usenames) ? INTEGER(f_ind)[j] : i;
-      li = VECTOR_ELT(l, j);
-      if (isNull(li) || !LENGTH(li) || idx < 0) continue;
-      thiscol = VECTOR_ELT(li, idx);
-      // Fix for #705, check attributes
-      if (j == data->first)
-        thisClass = getAttrib(thiscol, R_ClassSymbol);
-      if (isFactor(thiscol)) {
-        data->is_factor[i] = (isOrdered(thiscol)) ? 2 : 1;
-        data->max_type[i]  = STRSXP;
-      } else {
-        // Fix for #705, check attributes and error if non-factor class and not identical
-        if (!data->is_factor[i] &&
-          !R_compute_identical(thisClass, getAttrib(thiscol, R_ClassSymbol), 0) && !fill) {
-          error("Class attributes at column %d of input list at position %d does not match with column %d of input list at position %d. Coercion of objects of class 'factor' alone is handled internally by rbind/rbindlist at the moment.", i+1, j+1, i+1, data->first+1);
+  SEXP coercedForFactor = R_NilValue;
+  for (int j=0; j<ncol; ++j) {
+    int maxType=LGLSXP;  // initialize with LGLSXP for test 2002.3 which has col x NULL in both lists to be filled with NA for #1871
+    bool factor=false, orderedFactor=false;     // ordered factor is class c("ordered","factor"). isFactor() is true when isOrdered() is true.
+    int longestLen=0, longestW=-1, longestI=-1; // just for ordered factor
+    SEXP longestLevels=R_NilValue;              // just for ordered factor
+    bool int64=false;
+    bool foundName=false;
+    bool anyNotStringOrFactor=false;
+    SEXP firstCol=R_NilValue;
+    int firsti=-1, firstw=-1;
+    for (int i=0; i<LENGTH(l); ++i) {
+      SEXP li = VECTOR_ELT(l, i);
+      if (!length(li)) continue;
+      int w = usenames ? colMap[i*ncol + j] : j;  // colMap tells us which item to fetch for each of the final result columns, so we can stack column-by-column
+      if (w==-1) continue;  // column j of final result has no input from this item (fill must be true)
+      if (!foundName) {
+        SEXP cn=getAttrib(li, R_NamesSymbol);
+        if (length(cn)) { SET_STRING_ELT(ansNames, idcol+j, STRING_ELT(cn, w)); foundName=true; }
+      }
+      SEXP thisCol = VECTOR_ELT(li, w);
+      int thisType = TYPEOF(thisCol);
+      if (TYPEORDER(thisType)>TYPEORDER(maxType)) maxType=thisType;
+      if (isFactor(thisCol)) {
+        if (isNull(getAttrib(thisCol,R_LevelsSymbol))) error("Column %d of item %d has type 'factor' but has no levels; i.e. malformed.", w+1, i+1);
+        factor = true;
+        if (isOrdered(thisCol)) {
+          orderedFactor = true;
+          int thisLen = length(getAttrib(thisCol, R_LevelsSymbol));
+          if (thisLen>longestLen) { longestLen=thisLen; longestLevels=getAttrib(thisCol, R_LevelsSymbol); /*for warnings later ...*/longestW=w; longestI=i; }
         }
-        type = TYPEOF(thiscol);
-        if (type > data->max_type[i]) data->max_type[i] = type;
+      } else if (!isString(thisCol) && length(thisCol)) anyNotStringOrFactor=true;
+      if (INHERITS(thisCol, char_integer64)) {
+        if (firsti>=0 && !length(getAttrib(firstCol, R_ClassSymbol))) { firsti=i; firstw=w; firstCol=thisCol; } // so the integer64 attribute gets copied to target below
+        int64=true;
+      }
+      if (firsti==-1) { firsti=i; firstw=w; firstCol=thisCol; }
+      else if (!factor && !int64 && !R_compute_identical(getAttrib(thisCol, R_ClassSymbol), getAttrib(firstCol,R_ClassSymbol), 0)) {
+        error("Class attribute on column %d of item %d does not match with column %d of item %d.", w+1, i+1, firstw+1, firsti+1);
       }
     }
-  }
-}
-
-// function does c(idcol, nm), where length(idcol)=1
-// fix for #1432, + more efficient to move the logic to C
-SEXP add_idcol(SEXP nm, SEXP idcol, int cols) {
-  SEXP ans = PROTECT(allocVector(STRSXP, cols+1));
-  SET_STRING_ELT(ans, 0, STRING_ELT(idcol, 0));
-  for (int i=0; i<cols; i++) {
-    SET_STRING_ELT(ans, i+1, STRING_ELT(nm, i));
-  }
-  UNPROTECT(1);
-  return (ans);
-}
-
-SEXP rbindlist(SEXP l, SEXP sexp_usenames, SEXP sexp_fill, SEXP idcol) {
-
-  R_len_t jj, ansloc, resi, i,j,r, idx, thislen;
-  struct preprocessData data;
-  Rboolean usenames, fill, to_copy = FALSE, coerced=FALSE, isidcol = !isNull(idcol);
-  SEXP fnames = R_NilValue, findices = R_NilValue, f_ind = R_NilValue, ans, lf, li, target, thiscol, levels;
-  R_len_t protecti=0;
-
-  // first level of error checks
-  if (!isLogical(sexp_usenames) || LENGTH(sexp_usenames)!=1 || LOGICAL(sexp_usenames)[0]==NA_LOGICAL)
-    error("use.names should be TRUE or FALSE");
-  if (!isLogical(sexp_fill) || LENGTH(sexp_fill) != 1 || LOGICAL(sexp_fill)[0] == NA_LOGICAL)
-    error("fill should be TRUE or FALSE");
-  if (!length(l)) return(l);
-  if (TYPEOF(l) != VECSXP) error("Input to rbindlist must be a list of data.tables");
-
-  usenames = LOGICAL(sexp_usenames)[0];
-  fill = LOGICAL(sexp_fill)[0];
-  if (fill && !usenames) {
-    // override default
-    warning("Resetting 'use.names' to TRUE. 'use.names' can not be FALSE when 'fill=TRUE'.\n");
-    usenames=TRUE;
-  }
-
-  // check for factor, get max types, and when usenames=TRUE get the answer 'names' and column indices for proper reordering.
-  preprocess(l, usenames, fill, &data);
-  if (usenames) findices = VECTOR_ELT(data.ans_ptr, 1);
-  protecti = data.protecti;   // TODO very ugly and doesn't seem right. Assign items to list instead, perhaps.
-  if (data.n_rows == 0 && data.n_cols == 0) {
-    UNPROTECT(protecti);
-    return(R_NilValue);
-  }
-  if (data.n_rows > INT32_MAX) {
-    error("Total rows in the list is %lld which is larger than the maximum number of rows, currently %d",
-          (long long)data.n_rows, INT32_MAX);
-  }
-  fnames = VECTOR_ELT(data.ans_ptr, 0);
-  if (isidcol) {
-    fnames = PROTECT(add_idcol(fnames, idcol, data.n_cols));
-    protecti++;
-  }
-  SEXP factorLevels = PROTECT(allocVector(VECSXP, data.lcount)); protecti++;
-  Rboolean *isRowOrdered = (Rboolean *)R_alloc(data.lcount, sizeof(Rboolean));
-  for (int i=0; i<data.lcount; i++) isRowOrdered[i] = FALSE;
-
-  ans = PROTECT(allocVector(VECSXP, data.n_cols+isidcol)); protecti++;
-  setAttrib(ans, R_NamesSymbol, fnames);
-  lf = VECTOR_ELT(l, data.first);
-  for(j=0; j<data.n_cols; j++) {
-    if (fill) target = allocNAVector(data.max_type[j], data.n_rows);  // no PROTECT needed as passed immediately to SET_VECTOR_ELT
-    else target = allocVector(data.max_type[j], data.n_rows);         // no PROTECT needed as passed immediately to SET_VECTOR_ELT
-    SET_VECTOR_ELT(ans, j+isidcol, target);
 
-    if (usenames) {
-      to_copy = TRUE;
-      f_ind   = VECTOR_ELT(findices, j);
-    } else {
-      thiscol = VECTOR_ELT(lf, j);
-      if (!isFactor(thiscol)) copyMostAttrib(thiscol, target); // all but names,dim and dimnames. And if so, we want a copy here, not keepattr's SET_ATTRIB.
-    }
-    ansloc = 0;
-    jj = 0; // to increment factorLevels
-    resi = -1;
-    for (i=data.first; i<LENGTH(l); i++) {
-      li = VECTOR_ELT(l,i);
-      if (!length(li)) continue;  // majority of time though, each item of l is populated
-      thislen = data.fn_rows[i];
-      idx = (usenames) ? INTEGER(f_ind)[i] : j;
-      if (idx < 0) {
-        ansloc += thislen;
-        resi++;
-        if (data.is_factor[j]) {
-          isRowOrdered[resi] = FALSE;
-          SET_VECTOR_ELT(factorLevels, jj, allocNAVector(data.max_type[j], 1)); // the only level here is NA.
-          jj++;
+    if (!foundName) { char buff[12]; sprintf(buff,"V%d",j+1); SET_STRING_ELT(ansNames, idcol+j, mkChar(buff)); }
+    if (factor) maxType=INTSXP;  // if any items are factors then a factor is created (could be an option)
+    if (int64 && maxType!=REALSXP)
+      error("Internal error: column %d of result is determined to be integer64 but maxType=='%s' != REALSXP", j+1, type2char(maxType)); // # nocov
+    SEXP target;
+    SET_VECTOR_ELT(ans, idcol+j, target=allocVector(maxType, nrow));  // does not initialize logical & numerics, but does initialize character and list
+    if (!factor) copyMostAttrib(firstCol, target); // all but names,dim and dimnames; mainly for class. And if so, we want a copy here, not keepattr's SET_ATTRIB.
+
+    if (factor && anyNotStringOrFactor) {
+      // in future warn, or use list column instead ... warning("Column %d contains a factor but not all items for the column are character or factor", idcol+j+1);
+      // some coercing from (likely) integer/numeric to character will be needed. But this coerce can feasibly fail with out-of-memory, so we have to do it up-front
+      // before the savetl_init() because we have no hook to clean up tl if coerceVector fails.
+      if (isNull(coercedForFactor)) coercedForFactor = PROTECT(allocVector(VECSXP, LENGTH(l)));
+      for (int i=0; i<LENGTH(l); ++i) {
+        int w = usenames ? colMap[i*ncol + j] : j;
+        if (w==-1) continue;
+        SEXP thisCol = VECTOR_ELT(VECTOR_ELT(l, i), w);
+        if (!isFactor(thisCol) && !isString(thisCol)) {
+          SET_VECTOR_ELT(coercedForFactor, i, coerceVector(thisCol, STRSXP));
         }
-        continue;
-      }
-      thiscol = VECTOR_ELT(li, idx);
-      if (thislen != length(thiscol)) error("Column %d of item %d is length %d, inconsistent with first column of that item which is length %d. rbind/rbindlist doesn't recycle as it already expects each item to be a uniform list, data.frame or data.table", j+1, i+1, length(thiscol), thislen);
-      // couldn't figure out a way to this outside this loop when fill = TRUE.
-      if (to_copy && !isFactor(thiscol)) {
-        copyMostAttrib(thiscol, target);
-        to_copy = FALSE;
-      }
-      resi++;  // after the first, there might be NULL or empty which are skipped, resi increments up until lcount
-      if (TYPEOF(thiscol) != TYPEOF(target) && !isFactor(thiscol)) {
-        thiscol = PROTECT(coerceVector(thiscol, TYPEOF(target)));
-        coerced = TRUE;
-        // TO DO: options(datatable.pedantic=TRUE) to issue this warning :
-        // warning("Column %d of item %d is type '%s', inconsistent with column %d of item %d's type ('%s')",j+1,i+1,type2char(TYPEOF(thiscol)),j+1,first+1,type2char(TYPEOF(target)));
       }
-      if (TYPEOF(target)!=STRSXP && TYPEOF(thiscol)!=TYPEOF(target)) {
-        error("Internal error in rbindlist.c: type of 'thiscol' [%s] should have already been coerced to 'target' [%s]. please report to data.table issue tracker.",
-              type2char(TYPEOF(thiscol)), type2char(TYPEOF(target)));
+    }
+    int ansloc=0;
+    if (factor) {
+      char warnStr[1000] = "";
+      savetl_init();  // no error from now (or warning given options(warn=2)) until savetl_end
+      int nLevel=0, allocLevel=0;
+      SEXP *levelsRaw = NULL;  // growing list of SEXP pointers. Raw since managed with raw realloc.
+      if (orderedFactor) {
+        // If all sets of ordered levels are compatible (no ambiguities or conflicts) then an ordered factor is created, otherwise regular factor.
+        // Currently the longest set of ordered levels is taken and all other ordered levels must be a compatible subset of that.
+        // e.g. c( a<c<b, z<a<c<b, a<b ) => z<a<c<b  [ the longest is the middle one, and the other two are ordered subsets of it ]
+        //      c( a<c<b, z<c<a<b, a<b ) => regular factor because it contains an ambiguity: is a<c or c<a?
+        //      c( a<c<b, c<b, 'c,b'   ) => a<c<b  because the regular factor/character items c and b exist in the ordered levels
+        //      c( a<c<b, c<b, 'c,d'   ) => a<c<b<d  'd' from non-ordered item added on the end of longest ordered levels
+        //      c( a<c<b, c<b<d<e )  => regular factor because this case isn't yet implemented. a<c<b<d<e would be possible in future (extending longest at the beginning or end)
+        const SEXP *sd = STRING_PTR(longestLevels);
+        nLevel = allocLevel = longestLen;
+        levelsRaw = (SEXP *)malloc(nLevel * sizeof(SEXP));
+        if (!levelsRaw) { savetl_end(); error("Failed to allocate working memory for %d ordered factor levels of result column %d", nLevel, idcol+j+1); }
+        for (int k=0; k<longestLen; ++k) {
+          SEXP s = sd[k];
+          if (TRUELENGTH(s)>0) savetl(s);
+          levelsRaw[k] = s;
+          SET_TRUELENGTH(s,-k-1);
+        }
+        for (int i=0; i<LENGTH(l); ++i) {
+          int w = usenames ? colMap[i*ncol + j] : j;
+          if (w==-1) continue;
+          SEXP thisCol = VECTOR_ELT(VECTOR_ELT(l, i), w);
+          if (isOrdered(thisCol)) {
+            SEXP levels = getAttrib(thisCol, R_LevelsSymbol);
+            const SEXP *levelsD = STRING_PTR(levels);
+            const int n = length(levels);
+            for (int k=0, last=0; k<n; ++k) {
+              SEXP s = levelsD[k];
+              const int tl = TRUELENGTH(s);
+              if (tl>=last) {  // if tl>=0 then also tl>=last because last<=0
+                if (tl>=0) {
+                  snprintf(warnStr, sizeof(warnStr),  // not direct warning as we're inside tl region; bounded because level names can be arbitrarily long
+                  "Column %d of item %d is an ordered factor but level %d ['%s'] is missing from the ordered levels from column %d of item %d. " \
+                  "Each set of ordered factor levels should be an ordered subset of the first longest. A regular factor will be created for this column.",
+                  w+1, i+1, k+1, CHAR(s), longestW+1, longestI+1);
+                } else {
+                  snprintf(warnStr, sizeof(warnStr),  // bounded because level names can be arbitrarily long
+                  "Column %d of item %d is an ordered factor with '%s'<'%s' in its levels. But '%s'<'%s' in the ordered levels from column %d of item %d. " \
+                  "A regular factor will be created for this column due to this ambiguity.",
+                  w+1, i+1, CHAR(levelsD[k-1]), CHAR(s), CHAR(s), CHAR(levelsD[k-1]), longestW+1, longestI+1);
+                  // k>=1 (so k-1 is ok) because when k==0 last==0 and this branch wouldn't happen
+                }
+                orderedFactor=false;
+                i=LENGTH(l);  // break outer i loop
+                break;        // break inner k loop
+                // we leave the tl set for the longest levels; the regular factor will be created with the longest ordered levels first in case that is useful to the user
+              }
+              last = tl;  // negative ordinal; last should monotonically grow more negative if the levels are an ordered subset of the longest
+            }
+          }
+        }
       }
-      switch(TYPEOF(target)) {
-      case STRSXP :
-        isRowOrdered[resi] = FALSE;
-        if (isFactor(thiscol)) {
-          levels = getAttrib(thiscol, R_LevelsSymbol);
-          if (isNull(levels)) error("Column %d of item %d has type 'factor' but has no levels; i.e. malformed.", j+1, i+1);
-          for (r=0; r<thislen; r++)
-            if (INTEGER(thiscol)[r]==NA_INTEGER)
-              SET_STRING_ELT(target, ansloc+r, NA_STRING);
-            else
-              SET_STRING_ELT(target, ansloc+r, STRING_ELT(levels,INTEGER(thiscol)[r]-1));
-
-          // add levels to factorLevels
-          // changed "i" to "jj" and increment 'jj' after so as to fill only non-empty tables with levels
-          SET_VECTOR_ELT(factorLevels, jj, levels); jj++;
-          if (isOrdered(thiscol)) isRowOrdered[resi] = TRUE;
+      for (int i=0; i<LENGTH(l); ++i) {
+        const int thisnrow = eachMax[i];
+        if (thisnrow==0) continue;
+        SEXP li = VECTOR_ELT(l, i);
+        int w = usenames ? colMap[i*ncol + j] : j;
+        SEXP thisCol;
+        if (w==-1 || !length(thisCol=VECTOR_ELT(li, w))) {  // !length for zeroCol warning above; #1871
+          writeNA(target, ansloc, thisnrow);
         } else {
-          if (TYPEOF(thiscol) != STRSXP) error("Internal logical error in rbindlist.c (not STRSXP), please report to data.table issue tracker.");
-          for (r=0; r<thislen; r++) SET_STRING_ELT(target, ansloc+r, STRING_ELT(thiscol,r));
-
-          // if this column is going to be a factor, add column to factorLevels
-          // changed "i" to "jj" and increment 'jj' after so as to fill only non-empty tables with levels
-          if (data.is_factor[j]) {
-            SET_VECTOR_ELT(factorLevels, jj, thiscol);
-            jj++;
+          SEXP thisColStr = isFactor(thisCol) ? getAttrib(thisCol, R_LevelsSymbol) : (isString(thisCol) ? thisCol : VECTOR_ELT(coercedForFactor, i));
+          const int n = length(thisColStr);
+          const SEXP *thisColStrD = STRING_PTR(thisColStr);  // D for data
+          for (int k=0; k<n; ++k) {
+            SEXP s = thisColStrD[k];
+            if (TRUELENGTH(s)<0) continue;  // seen this level before (handles finding unique within character columns too)
+            if (TRUELENGTH(s)>0) savetl(s);
+            if (allocLevel==nLevel) {       // including initial time when allocLevel==nLevel==0
+              SEXP *tt = NULL;
+              if (allocLevel<INT_MAX) {
+                int64_t new = (int64_t)allocLevel+n-k+1024; // if all remaining levels in this item haven't been seen before, plus 1024 margin in case of many very short levels
+                allocLevel = (new>(int64_t)INT_MAX) ? INT_MAX : (int)new;
+                tt = (SEXP *)realloc(levelsRaw, allocLevel*sizeof(SEXP));  // first time levelsRaw==NULL and realloc==malloc in that case
+              }
+              if (tt==NULL) {
+                // # nocov start
+                // C spec states that if realloc() fails (above) the original block (levelsRaw) is left untouched: it is not freed or moved. We ...
+                for (int k=0; k<nLevel; k++) SET_TRUELENGTH(levelsRaw[k], 0);   // ... rely on that in this loop which uses levelsRaw.
+                free(levelsRaw);
+                savetl_end();
+                error("Failed to allocate working memory for %d factor levels of result column %d when reading column %d of item %d", allocLevel, idcol+j+1, w+1, i+1);
+                // # nocov end
+              }
+              levelsRaw = tt;
+            }
+            SET_TRUELENGTH(s,-(++nLevel));
+            levelsRaw[nLevel-1] = s;
+          }
+          int *targetd = INTEGER(target);
+          if (isFactor(thisCol)) {
+            // Loop through this item's levels. If level k already maps to combined level k+1 for every k, the integer codes can be copied as-is with memcpy. Otherwise hop via the integer map.
+            bool nohop = true;
+            for (int k=0; k<n; ++k) if (-TRUELENGTH(thisColStrD[k]) != k+1) { nohop=false; break; }
+            if (nohop) memcpy(targetd+ansloc, INTEGER(thisCol), thisnrow*SIZEOF(thisCol));
+            else {
+              int *id = INTEGER(thisCol);
+              for (int r=0; r<thisnrow; r++)
+                targetd[ansloc+r] = id[r]==NA_INTEGER ? NA_INTEGER : -TRUELENGTH(thisColStrD[id[r]-1]);
+            }
+          } else {
+            SEXP *sd = STRING_PTR(thisColStr);
+            for (int r=0; r<thisnrow; r++) targetd[ansloc+r] = sd[r]==NA_STRING ? NA_INTEGER : -TRUELENGTH(sd[r]);
           }
-          // removed 'coerced=FALSE; UNPROTECT(1)' as it resulted in a stack imbalance.
-          // anyways it's taken care of after the switch. So no need here.
         }
-        break;
-      case VECSXP :
-        for (r=0; r<thislen; r++)
-          SET_VECTOR_ELT(target, ansloc+r, VECTOR_ELT(thiscol,r));
-        break;
-      case CPLXSXP : // #1659 fix
-        for (r=0; r<thislen; r++)
-          COMPLEX(target)[ansloc+r] = COMPLEX(thiscol)[r];
-        break;
-      case REALSXP:
-        memcpy(REAL(target)+ansloc, REAL(thiscol), thislen*SIZEOF(thiscol));
-        break;
-      case INTSXP:
-        memcpy(INTEGER(target)+ansloc, INTEGER(thiscol), thislen*SIZEOF(thiscol));
-        break;
-      case LGLSXP:
-        memcpy(LOGICAL(target)+ansloc, LOGICAL(thiscol), thislen*SIZEOF(thiscol));
-        break;
-      default :
-        error("Unsupported column type '%s'", type2char(TYPEOF(target)));
-      }
-      ansloc += thislen;
-      if (coerced) {
-        UNPROTECT(1);
-        coerced = FALSE;
+        ansloc += thisnrow;
       }
-    }
-    if (data.is_factor[j]) {
-      SEXP finalFactorLevels = PROTECT(combineFactorLevels(factorLevels, &(data.is_factor[j]), isRowOrdered));
-      SEXP factorLangSxp = PROTECT(lang3(install(data.is_factor[j] == 1 ? "factor" : "ordered"),
-                         target, finalFactorLevels));
-      SET_VECTOR_ELT(ans, j+isidcol, eval(factorLangSxp, R_GlobalEnv));
-      UNPROTECT(2);  // finalFactorLevels, factorLangSxp
-    }
-  }
-
-  // fix for #1432, + more efficient to move the logic to C
-  if (isidcol) {
-    R_len_t runidx = 1, cntridx = 0;
-    SEXP lnames = getAttrib(l, R_NamesSymbol);
-    if (isNull(lnames)) {
-      SET_VECTOR_ELT(ans, 0, target=allocVector(INTSXP, data.n_rows) );
-      for (i=0; i<LENGTH(l); i++) {
-        for (j=0; j<data.fn_rows[i]; j++)
-          INTEGER(target)[cntridx++] = runidx;
-        runidx++;
+      for (int k=0; k<nLevel; ++k) SET_TRUELENGTH(levelsRaw[k], 0);
+      savetl_end();
+      if (warnStr[0]) warning(warnStr);  // now savetl_end() has happened it's safe to call warning (could error if options(warn=2))
+      SEXP levelsSxp;
+      setAttrib(target, R_LevelsSymbol, levelsSxp=allocVector(STRSXP, nLevel));
+      for (int k=0; k<nLevel; ++k) SET_STRING_ELT(levelsSxp, k, levelsRaw[k]);
+      free(levelsRaw);
+      if (orderedFactor) {
+        SEXP tt;
+        setAttrib(target, R_ClassSymbol, tt=allocVector(STRSXP, 2));
+        SET_STRING_ELT(tt, 0, char_ordered);
+        SET_STRING_ELT(tt, 1, char_factor);
+      } else {
+        setAttrib(target, R_ClassSymbol, ScalarString(char_factor));
       }
     } else {
-      SET_VECTOR_ELT(ans, 0, target=allocVector(STRSXP, data.n_rows) );
-      for (i=0; i<LENGTH(l); i++) {
-        for (j=0; j<data.fn_rows[i]; j++)
-          SET_STRING_ELT(target, cntridx++, STRING_ELT(lnames, i));
+      for (int i=0; i<LENGTH(l); ++i) {
+        const int thisnrow = eachMax[i];
+        if (thisnrow==0) continue;
+        SEXP li = VECTOR_ELT(l, i);
+        int w = usenames ? colMap[i*ncol + j] : j;
+        SEXP thisCol;
+        if (w==-1 || !length(thisCol=VECTOR_ELT(li, w))) {
+          writeNA(target, ansloc, thisnrow);  // writeNA is integer64 aware and writes INT64_MIN
+        } else {
+          const char *ret = memrecycle(target, R_NilValue, ansloc, thisnrow, thisCol);  // coerces if needed within memrecycle; possibly with a no-alloc direct coerce
+          if (ret) warning("Column %d of item %d: %s", w+1, i+1, ret);  // currently just one warning when precision is lost; e.g. assigning 3.4 to integer64
+        }
+        ansloc += thisnrow;
       }
     }
   }
-
-  UNPROTECT(protecti);
+  if (!isNull(coercedForFactor)) UNPROTECT(1);
+  UNPROTECT(1);  // ans
   return(ans);
 }
 
-/*
-## The section below implements "chmatch2_old" and "chmatch2" (faster version of chmatch2_old).
-## It's basically 'pmatch' but without the partial matching part. These examples should
-## make it clearer.
-## Examples:
-## chmatch2_old(c("a", "a"), c("a", "a"))     # 1,2  - the second 'a' in 'x' has a 2nd match in 'table'
-## chmatch2_old(c("a", "a"), c("a", "b"))     # 1,NA - the second one doesn't 'see' the first 'a'
-## chmatch2_old(c("a", "a"), c("a", "a.1"))   # 1,NA - differs from 'pmatch' output = 1,2
-##
-## The algorithm: given 'x' and 'y':
-## dt = data.table(val=c(x,y), grp1 = rep(1:2, c(length(x),length(y))), grp2=c(1:length(x), 1:length(y)))
-## dt[, grp1 := 0:(.N-1), by="val,grp1"]
-## dt[, grp2[2], by="val,grp1"]
-##
-## NOTE: This is FAST, but not AS FAST AS it could be. See chmatch2 for a faster implementation (and bottom
-## of this file for a benchmark). I've retained here for now. Ultimately, will've to discuss with Matt and
-## probably export it??
-*/
-SEXP chmatch2_old(SEXP x, SEXP table, SEXP nomatch) {
-
-  R_len_t i, j, k, nx, li, si, oi;
-  if (TYPEOF(nomatch) != INTSXP || length(nomatch) != 1) error("'nomatch' must be an integer of length 1");
-  if (!length(x) || isNull(x)) return(allocVector(INTSXP, 0));
-  if (TYPEOF(x) != STRSXP) error("'x' must be a character vector");
-  nx=length(x);
-  if (!length(table) || isNull(table)) {
-    SEXP ans = PROTECT(allocVector(INTSXP, nx));
-    for (i=0; i<nx; i++) INTEGER(ans)[i] = INTEGER(nomatch)[0];
-    UNPROTECT(1);
-    return(ans);
-  }
-  if (TYPEOF(table) != STRSXP) error("'table' must be a character vector");
-  // Done with special cases. On to the real deal.
-
-  SEXP tt = PROTECT(allocVector(VECSXP, 2));
-  SET_VECTOR_ELT(tt, 0, x);
-  SET_VECTOR_ELT(tt, 1, table);
-  SEXP dt = PROTECT(unlist2(tt));
-
-  // order - first time
-  SEXP order = PROTECT(fast_order(dt, 2, 1));
-  SEXP start = getAttrib(order, sym_starts);
-  SEXP lens  = PROTECT(uniq_lengths(start, length(order))); // length(order) = nrow(dt)
-  SEXP grpid = VECTOR_ELT(dt, 1);
-  SEXP index = VECTOR_ELT(dt, 2);
-
-  // replace dt[1], we don't need it anymore
-  k=0;
-  for (i=0; i<length(lens); i++) {
-    for (j=0; j<INTEGER(lens)[i]; j++) {
-      INTEGER(grpid)[INTEGER(order)[k+j]-1] = j;
-    }
-    k += j;
-  }
-  // order - again
-  order = PROTECT(fast_order(dt, 2, 1));
-  start = getAttrib(order, sym_starts);
-  lens  = PROTECT(uniq_lengths(start, length(order)));
-
-  SEXP ans = PROTECT(allocVector(INTSXP, nx));
-  k = 0;
-  for (i=0; i<length(lens); i++) {
-    li = INTEGER(lens)[i];
-    si = INTEGER(start)[i]-1;
-    oi = INTEGER(order)[si]-1;
-    if (oi > nx-1) continue;
-    INTEGER(ans)[oi] = (li == 2) ? INTEGER(index)[INTEGER(order)[si+1]-1]+1 : INTEGER(nomatch)[0];
-  }
-  UNPROTECT(7);
-  return(ans);
-}
-
-// utility function used from within chmatch2
-static SEXP listlist(SEXP x) {
-
-  R_len_t i,j,k, nl;
-  SEXP lx, xo, xs, xl, tmp, ans, ans0, ans1;
-
-  lx = PROTECT(allocVector(VECSXP, 1));
-  SET_VECTOR_ELT(lx, 0, x);
-  xo = PROTECT(fast_order(lx, 1, 1));
-  xs = getAttrib(xo, sym_starts);
-  xl = PROTECT(uniq_lengths(xs, length(x)));
-
-  ans0 = PROTECT(allocVector(STRSXP, length(xs)));
-  ans1 = PROTECT(allocVector(VECSXP, length(xs)));
-  k=0;
-  for (i=0; i<length(xs); i++) {
-    SET_STRING_ELT(ans0, i, STRING_ELT(x, INTEGER(xo)[INTEGER(xs)[i]-1]-1));
-    nl = INTEGER(xl)[i];
-    SET_VECTOR_ELT(ans1, i, tmp=allocVector(INTSXP, nl) );
-    for (j=0; j<nl; j++) {
-      INTEGER(tmp)[j] = INTEGER(xo)[k+j];
-    }
-    k += j;
-  }
-  ans = PROTECT(allocVector(VECSXP, 2));
-  SET_VECTOR_ELT(ans, 0, ans0);
-  SET_VECTOR_ELT(ans, 1, ans1);
-  UNPROTECT(6);
-  return(ans);
-}
-
-/*
-## While chmatch2_old works great, I find it inefficient in terms of both memory (stores 2 indices over the
-## length of x+y) and speed (2 ordering and looping over unnecesssary amount of times). So, here's
-## another stab at a faster version of 'chmatch2_old', leveraging the power of 'chmatch' and data.table's
-## DT[ , list(list()), by=.] syntax.
-##
-## The algorithm:
-## x.agg = data.table(x)[, list(list(rep(x, .N))), by=x]
-## y.agg = data.table(y)[, list(list(rep(y, .N))), by=y]
-## mtch  = chmatch(x.agg, y.agg, nomatch)                 ## here we look at only unique values!
-## Now, it's just a matter of filling corresponding matches from x.agg's indices with y.agg's indices.
-## BENCHMARKS ON THE BOTTOM OF THIS FILE
-*/
-SEXP chmatch2(SEXP x, SEXP y, SEXP nomatch) {
-
-  R_len_t i, j, k, nx, ix, iy;
-  SEXP xll, yll, xu, yu, ans, xl, yl, mx;
-  if (TYPEOF(nomatch) != INTSXP || length(nomatch) != 1) error("'nomatch' must be an integer of length 1");
-  if (!length(x) || isNull(x)) return(allocVector(INTSXP, 0));
-  if (TYPEOF(x) != STRSXP) error("'x' must be a character vector");
-  nx = length(x);
-  if (!length(y) || isNull(y)) {
-    ans = PROTECT(allocVector(INTSXP, nx));
-    for (i=0; i<nx; i++) INTEGER(ans)[i] = INTEGER(nomatch)[0];
-    UNPROTECT(1);
-    return(ans);
-  }
-  if (TYPEOF(y) != STRSXP) error("'table' must be a character vector");
-  // Done with special cases. On to the real deal.
-  xll = PROTECT(listlist(x));
-  yll = PROTECT(listlist(y));
-
-  xu = VECTOR_ELT(xll, 0);
-  yu = VECTOR_ELT(yll, 0);
-
-  mx  = PROTECT(chmatch(xu, yu, 0, FALSE));
-  ans = PROTECT(allocVector(INTSXP, nx));
-  k=0;
-  for (i=0; i<length(mx); i++) {
-    xl = VECTOR_ELT(VECTOR_ELT(xll, 1), i);
-    ix = length(xl);
-    if (INTEGER(mx)[i] == 0) {
-      for (j=0; j<ix; j++)
-        INTEGER(ans)[INTEGER(xl)[j]-1] = INTEGER(nomatch)[0];
-    } else {
-      yl = VECTOR_ELT(VECTOR_ELT(yll, 1), INTEGER(mx)[i]-1);
-      iy = length(yl);
-      for (j=0; j < ix; j++)
-        INTEGER(ans)[INTEGER(xl)[j]-1] = (j < iy) ? INTEGER(yl)[j] : INTEGER(nomatch)[0];
-      k += ix;
-    }
-  }
-  UNPROTECT(4);
-  return(ans);
-
-}
-
-/*
-## Benchmark:
-set.seed(45L)
-x <- sample(letters, 1e6, TRUE)
-y <- sample(letters, 1e7, TRUE)
-system.time(ans1 <- .Call("Cchmatch2_old", x,y,0L)) # 2.405 seconds
-system.time(ans2 <- .Call("Cchmatch2", x,y,0L)) # 0.174 seconds
-identical(ans1, ans2) # [1] TRUE
-## Note: 'pmatch(x,y,0L)' dint finish still after about 5 minutes, so stopped.
-## Speed up of about ~14x!!! nice ;). (And this uses lesser memory as well).
-*/
diff --git a/src/subset.c b/src/subset.c
index 11172eca1f..40563cad2d 100644
--- a/src/subset.c
+++ b/src/subset.c
@@ -270,6 +270,7 @@ SEXP subsetDT(SEXP x, SEXP rows, SEXP cols) {
     ansn = LENGTH(rows);  // has been checked not to contain zeros or negatives, so this length is the length of result
     for (int i=0; i<LENGTH(cols); i++) {
       SEXP source = VECTOR_ELT(x, INTEGER(cols)[i]-1);
+      if (isNull(source)) error("Internal error: column %d of data.table is NULL; malformed", INTEGER(cols)[i]);
       SEXP target;
       SET_VECTOR_ELT(ans, i, target=allocVector(TYPEOF(source), ansn));
       copyMostAttrib(source, target);
@@ -292,7 +293,7 @@ SEXP subsetDT(SEXP x, SEXP rows, SEXP cols) {
   // but maintain key if ordered subset
   SEXP key = getAttrib(x, sym_sorted);
   if (length(key)) {
-    SEXP in = PROTECT(chmatch(key,getAttrib(ans,R_NamesSymbol), 0, TRUE)); nprotect++; // (nomatch ignored when in=TRUE)
+    SEXP in = PROTECT(chin(key, getAttrib(ans,R_NamesSymbol))); nprotect++;
     int i = 0;  while(i<LENGTH(key) && LOGICAL(in)[i]) i++;
     // i is now the keylen that can be kept. 2 lines above much easier in C than R
     if (i==0 || !orderedSubset) {
@@ -313,9 +314,10 @@ SEXP subsetDT(SEXP x, SEXP rows, SEXP cols) {
 SEXP subsetVector(SEXP x, SEXP idx) { // idx is 1-based passed from R level
   bool anyNA=false, orderedSubset=false;
   int nprotect=0;
-  if (check_idx(idx, length(x), &anyNA, &orderedSubset) != NULL) {
+  if (isNull(x))
+    error("Internal error: NULL cannot be subset. It is invalid for a data.table to contain a NULL column.");       // # nocov
+  if (check_idx(idx, length(x), &anyNA, &orderedSubset) != NULL)
     error("Internal error: CsubsetVector is internal-use-only but has received negatives, zeros or out-of-range");  // # nocov
-  }
   SEXP ans = PROTECT(allocVector(TYPEOF(x), length(idx))); nprotect++;
   copyMostAttrib(x, ans);
   subsetVectorRaw(ans, x, idx, anyNA);
diff --git a/src/wrappers.c b/src/wrappers.c
index 61b3b68cae..7669a320a6 100644
--- a/src/wrappers.c
+++ b/src/wrappers.c
@@ -38,7 +38,7 @@ SEXP setlevels(SEXP x, SEXP levels, SEXP ulevels) {
   xchar = PROTECT(allocVector(STRSXP, nx));
   for (i=0; i<nx; i++)
     SET_STRING_ELT(xchar, i, STRING_ELT(levels, INTEGER(x)[i]-1));
-  newx = PROTECT(chmatch(xchar, ulevels, NA_INTEGER, FALSE));
+  newx = PROTECT(chmatch(xchar, ulevels, NA_INTEGER));
   for (i=0; i<nx; i++) INTEGER(x)[i] = INTEGER(newx)[i];
   setAttrib(x, R_LevelsSymbol, ulevels);
   UNPROTECT(2);
@@ -50,13 +50,6 @@ SEXP copy(SEXP x)
   return(duplicate(x));
 }
 
-SEXP copyattr(SEXP from, SEXP to)
-{
-  // for use by [.data.table to retain attribs such as "comments" when subsetting and j is missing
-  copyMostAttrib(from, to);
-  return(R_NilValue);
-}
-
 SEXP setlistelt(SEXP l, SEXP i, SEXP value)
 {
   R_len_t i2;