Skip to content

Support non-ASCII characters on Windows#130

Merged
sds merged 3 commits into
enkessler:masterfrom
james-stocks:CreateProcessW_skiptest
Jan 12, 2019
Merged

Support non-ASCII characters on Windows#130
sds merged 3 commits into
enkessler:masterfrom
james-stocks:CreateProcessW_skiptest

Conversation

@james-stocks

Copy link
Copy Markdown

Use CreateProcessW instead of CreateProcessA to create processes on Windows; allowing for non-ASCII characters in the command line and the environment.

This PR causes one spec test to be skipped for Windows when Ruby < 2.3; and it leaves a TODO: comment in the code of that test. For this case, the Ruby ENV hash cannot be used to check the child environment because it will not be reliably encoded/decoded by Ruby.
The TODO item is to re-write the test in a way that does not require ruby's ENV hash.

James Stocks added 2 commits October 24, 2017 15:00
Support wide characters on Windows. This allows for non-ASCII characters
in the command line.
@glennsarti

Copy link
Copy Markdown

@coveralls

coveralls commented Oct 30, 2017

Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.1%) to 94.993% when pulling c4f8e49 on james-stocks:CreateProcessW_skiptest into 0bce27d on enkessler:master.

@james-stocks

Copy link
Copy Markdown
Author

@glennsarti The limitation affects only the spec_helper code and not childprocess itself.

Following what Puppet does is one solution; an easier approach may be to have the test case avoid needing to check the environment e.g. have the test spawn a process that signals success only if the required environment value is set; and pass/fail the test on that signal.

In either case; I would prefer this PR to merge as-is if possible

Comment thread spec/childprocess_spec.rb

it 'allows unicode characters in the environment' do
# This test does not work on Windows for Ruby < 2.3 because ENV will not be properly decoded
# This was fixed in Ruby 2.3 here: https://github.com/ruby/ruby/commit/5e3467c4414df815b3b00d2b0372026b069e7f7d

@glennsarti glennsarti Oct 31, 2017

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that ruby only fixed using UTF8 characters in ENV VALUES not keys

If you change the test the code to

    Tempfile.open("env-spec") do |file|
      process = write_env(file.path)
      process.environment['baör'] = 'baör'
      process.start
      process.wait
      
      child_env = eval rewind_and_read(file)
      
      expect(child_env['baör'].force_encoding('UTF-8')).to eql 'baör'
    end

The test fails even on modern rubys

Inspecting child_env in pry you can see the corruption

[1] pry(#<RSpec::ExampleGroups::ChildProcess>)> child_env
=> {"ALLUSERSPROFILE"=>"C:\\ProgramData",
 "ANSICON"=>"185x2000 (185x65)",
 "ANSICON_DEF"=>"7",
 "APPDATA"=>"C:\\Users\\glenn.sarti\\AppData\\Roaming",
 "BUNDLER_VERSION"=>"1.15.4",
 "BUNDLE_BIN_PATH"=>"C:/tools/ruby2.3.1x64/lib/ruby/gems/2.3.0/gems/bundler-1.15.4/exe/bundle",
 "BUNDLE_GEMFILE"=>"C:/Source/tmp/childprocess/Gemfile",
 "ChocolateyInstall"=>"C:\\ProgramData\\chocolatey",
 "ChocolateyLastPathUpdate"=>"Sun Aug 27 19:31:25 2017",
 "ChocolateyToolsLocation"=>"C:\\tools",
...
"ba\xC3\xB6r"=>"ba\xC3\xB6r"}

Also note that this testing method mangles any existing UTF8 env vars
e.g. in PowerShell create a UTF8 envvar

PS> $ENV:UTF8ᚠᛇᚻ = 'UTF8ᚠᛇᚻ'
PS> dir ENV:

Name                           Value
----                           -----
ALLUSERSPROFILE                C:\ProgramData
ANSICON                        100x2000 (100x33)
ANSICON_DEF                    7
APPDATA                        C:\Users\glenn.sarti\AppData\Roaming
ChocolateyInstall              C:\ProgramData\chocolatey
...
UTF8ᚠᛇᚻ                        UTF8ᚠᛇᚻ

Then run the spec test and pry into child_env

...
 "USERNAME"=>"glenn.sarti",
 "USERPROFILE"=>"C:\\Users\\glenn.sarti",
 "UTF8???"=>"UTF8???",
 "windir"=>"C:\\WINDOWS",
 "ba\xC3\xB6r"=>"ba\xC3\xB6r"}

Note the UTF8 characters are now ? and the UTF8 from childprocess is now bytes. This has not been fixed in Ruby 2.4.x yet either.

This test will not currently work as the Ruby ENV hash has encoding limitations.
This is fixed in Ruby 2.3
(ruby/ruby@5e3467c)
@james-stocks james-stocks force-pushed the CreateProcessW_skiptest branch from c4f8e49 to 83c2a88 Compare November 1, 2017 15:42
@james-stocks

Copy link
Copy Markdown
Author

@glennsarti I pushed a minor change to the comment on the test, which makes your last comment "outdated" (thanks Github).

I believe the limitations you describe do not affect ChildProcess itself; but only this spec testing method for checking the environment.

Any objection to merging this PR, and I can file a ticket on this project to get the spec tests moved away from their reliance on the ENV hash?

@glennsarti

Copy link
Copy Markdown

I'm not a maintainer so I have no authority :-)

Only reason I would object is; You're changing the gem so it accepts non-ASCII (i.e. non US English text) but the testing fixtures aren't capable enough. Seems odd to add this without tests to prove it works and nothing regresses.

@glennsarti

Copy link
Copy Markdown

Perhaps a quick method would be to use a batch file or powershell to output the environment variables on Windows instead of using ruby ENV hash. If I get time I'll try something, but I have no rights on your branch or this repo.

@coveralls

coveralls commented Nov 1, 2017

Copy link
Copy Markdown

Coverage Status

Coverage decreased (-4.08%) to 90.795% when pulling 83c2a88 on james-stocks:CreateProcessW_skiptest into 0bce27d on enkessler:master.

@da-ar

da-ar commented Jan 8, 2018

Copy link
Copy Markdown
Contributor

@enkessler, I work for the same company as @james-stocks and I'm taking over responsibility for this PR from him.

On picking this up, I can confirm the following:

Is there anything you need / want to see in order for the merge to go ahead?

@enkessler

Copy link
Copy Markdown
Owner

Sorry about the delay in my addressing this PR.

@da-ar My main concern in switching from ANSI to Unicode is not breaking things and backward compatibility. So,

  1. Do Unicode chars now work?
  2. Do ANSI chars still work?
  3. Can we prove it?

Looking at the code changes in the PR and reading up a bit on CreateProcessA and CreateProcessW, things look reasonable enough. It's not my expertise and I'm not 100% on the number of terminating characters being slapped onto the strings, but there are tests in the project that use 'regular' chars and this PR has a test the uses 'fancy' chars and they both seem to be passing.

I'd say the only holdup is point 3. From the conversation in this PR, it sounds like there is a workaround that would allow the functionality to be tested on older Rubies as well. I'd rather that happen now rather than later.

@da-ar

da-ar commented Jan 10, 2018

Copy link
Copy Markdown
Contributor

I'll have a look at what needs to happen with the tests to prove this out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants