<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Chris Needham's Blog</title>
  <link href="https://chrisneedham.com"/>
  <link type="application/atom+xml" rel="self" href="https://chrisneedham.com/atom.xml"/>
  <icon>https://chrisneedham.com/favicon.ico</icon>
  <updated>2023-08-07T17:00:56+01:00</updated>
  <id>https://chrisneedham.com/</id>
  <author>
    <name>Chris Needham</name>
    <email>chris@chrisneedham.com</email>
  </author>

  

  
    <entry>
      <id>https://chrisneedham.com/posts/2014/06/21/ruby-cron-utf8</id>
      <link type="text/html" rel="alternate" href="https://chrisneedham.com/posts/2014/06/21/ruby-cron-utf8.html"/>
      <title>Ruby, cron, UTF-8, and the locale environment variables</title>
      <updated>2014-06-21T19:20:00+01:00</updated>
      <author>
        <name>Chris Needham</name>
      </author>
      <content type="html">&lt;p&gt;I&amp;rsquo;ve just spent an hour or so debugging a problem with one of our internal Web services at work, and I thought I&amp;rsquo;d share the details, in case anyone else comes across it.&lt;/p&gt;

&lt;p&gt;The problem was with a Ruby script running on an Ubuntu 12.04 server. The script runs under &lt;a href=&quot;https://manpages.ubuntu.com/manpages/precise/man8/cron.8.html&quot;&gt;cron&lt;/a&gt; and periodically requests data from a JSON Web service and adds it to an &lt;a href=&quot;https://lucene.apache.org/solr/&quot;&gt;Apache Solr&lt;/a&gt; data store. The program was written a few years ago, and had been running fine until just recently when it started failing to update Solr. The log file showed the program was exiting because of the following exception:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;/path/to/lib/ruby/2.0.0/json/common.rb:155:in `encode&apos;: &quot;\xC2&quot;
  on US-ASCII (Encoding::InvalidByteSequenceError)
  from /path/to/lib/ruby/2.0.0/json/common.rb:155:in `initialize&apos;
  from /path/to/lib/ruby/2.0.0/json/common.rb:155:in `new&apos;
  from /path/to/lib/ruby/2.0.0/json/common.rb:155:in `parse&apos;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The code in question looked like this:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-ruby&quot; data-lang=&quot;ruby&quot;&gt;&lt;span class=&quot;n&quot;&gt;response&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;RestClient&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;JSON&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;post_to_solr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;and the exception was coming from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JSON.parse&lt;/code&gt; line.&lt;/p&gt;

&lt;p&gt;The exception indicated a character encoding problem, so I took a look at the data coming from the Web service to check that it was valid JSON, and to investigate the 0xC2 value that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JSON.parse&lt;/code&gt; was complaining about. This turned out to be the first byte of the sequence C2 A3, which in UTF-8 is the character &lt;a href=&quot;https://www.fileformat.info/info/unicode/char/a3/index.htm&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;POUND SIGN U+00A3&lt;/code&gt;&lt;/a&gt;. The question then was why the JSON parser thought the encoding was US-ASCII, when the Web service was returning UTF-8.&lt;/p&gt;
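
&lt;p&gt;As a minimal sketch of why those two bytes matter, the snippet below tags the same bytes first as UTF-8 and then as US-ASCII. The variable names are made up for illustration and nothing here comes from the original script. It only shows that the bytes are valid in one encoding and not in the other.&lt;/p&gt;

```ruby
# The two bytes from the exception, held as raw binary data.
bytes = "\xC2\xA3".b

# The same bytes, labelled with two different encodings.
as_utf8  = bytes.dup.force_encoding(Encoding::UTF_8)
as_ascii = bytes.dup.force_encoding(Encoding::US_ASCII)

puts as_utf8.valid_encoding?    # true: C2 A3 is a well-formed UTF-8 sequence
puts as_utf8.codepoints.map { |c| format("U+%04X", c) }.inspect
puts as_ascii.valid_encoding?   # false: US-ASCII has no bytes above 7F

begin
  # Transcoding from US-ASCII fails on the first non-ASCII byte.
  as_ascii.encode(Encoding::UTF_8)
rescue Encoding::InvalidByteSequenceError => e
  puts "#{e.class}: #{e.message}"
end
```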

&lt;p&gt;I tried running the script manually to see if I could reproduce the error, but run this way the script worked fine. So the problem wasn&amp;rsquo;t with the script itself, but with something in its execution environment.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://manpages.ubuntu.com/manpages/precise/man1/locale.1.html&quot;&gt;locale&lt;/a&gt; program can be used to get information on the locale settings in use. When run from the shell it gave this output:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;LANG&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;en_GB.UTF-8
&lt;span class=&quot;nv&quot;&gt;LANGUAGE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;en_GB:en
&lt;span class=&quot;nv&quot;&gt;LC_CTYPE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_NUMERIC&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_TIME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_COLLATE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_MONETARY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_MESSAGES&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_PAPER&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_ADDRESS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_TELEPHONE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_MEASUREMENT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_IDENTIFICATION&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en_GB.UTF-8&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_ALL&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;but when run under cron it output this instead:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;LANG&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_CTYPE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_NUMERIC&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_TIME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_COLLATE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_MONETARY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_MESSAGES&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_PAPER&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_ADDRESS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_TELEPHONE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_MEASUREMENT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_IDENTIFICATION&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POSIX&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;LC_ALL&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The &lt;a href=&quot;https://manpages.ubuntu.com/manpages/precise/man7/locale.7.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;locale&lt;/code&gt;&lt;/a&gt; documentation and the &lt;a href=&quot;https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html&quot;&gt;POSIX specification&lt;/a&gt; have the details on these variables.&lt;/p&gt;
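
&lt;p&gt;Ruby derives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Encoding.default_external&lt;/code&gt; from these locale settings at startup. The sketch below starts a child Ruby process under two different &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LANG&lt;/code&gt; values and prints what each one sees. The helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;default_external_for&lt;/code&gt; is hypothetical, and the result depends on the platform and on which locales are installed. On a typical Linux system I would expect US-ASCII for the empty value, but treat that as an assumption rather than a guarantee.&lt;/p&gt;

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: run a child Ruby with the given environment
# overrides and return the name of its default external encoding.
def default_external_for(env)
  out, _status = Open3.capture2(
    env, RbConfig.ruby, "-e", "print Encoding.default_external.name"
  )
  out
end

# Unset the variables that would otherwise take precedence over LANG.
base = { "LC_ALL" => nil, "LC_CTYPE" => nil }

puts default_external_for(base.merge("LANG" => ""))
puts default_external_for(base.merge("LANG" => "en_GB.UTF-8"))
```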

&lt;p&gt;Next I ran the script again, but with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LANG&lt;/code&gt; environment variable cleared:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ LANG=&quot;&quot; ./import-script
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Sure enough, this reported the same error. Fixing the problem was simple. It was enough to set just &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LANG&lt;/code&gt; in the &lt;a href=&quot;https://manpages.ubuntu.com/manpages/precise/man5/crontab.5.html&quot;&gt;crontab&lt;/a&gt; file:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;LANG=&quot;en_GB.UTF-8&quot;
0 * * * *       /path/to/import-script
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
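
&lt;p&gt;Setting &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LANG&lt;/code&gt; fixes the environment. A complementary option, which is not what the original script did, would be to label the response body as UTF-8 explicitly before parsing, so the result no longer depends on the locale. This is only a sketch: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;parse_utf8_json&lt;/code&gt; is a hypothetical helper, and it assumes the Web service really does always return UTF-8.&lt;/p&gt;

```ruby
require "json"

# Hypothetical helper: treat the body as UTF-8 whatever the locale says,
# and fail early if the bytes are not well-formed UTF-8.
def parse_utf8_json(body)
  text = body.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, "response body is not valid UTF-8" unless text.valid_encoding?
  JSON.parse(text)
end

# A made-up response body containing the pound sign as bytes C2 A3.
raw = %({"price":"\xC2\xA350"}).b
data = parse_utf8_json(raw)
puts data["price"]
```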

&lt;p&gt;Problem solved, and users of this Web service are happy again :-)&lt;/p&gt;

</content>
    </entry>
  
    <entry>
      <id>https://chrisneedham.com/posts/2014/03/31/first-post</id>
      <link type="text/html" rel="alternate" href="https://chrisneedham.com/posts/2014/03/31/first-post.html"/>
      <title>First post</title>
      <updated>2014-03-31T17:30:00+01:00</updated>
      <author>
        <name>Chris Needham</name>
      </author>
      <content type="html">&lt;p&gt;So, I finally got around to starting a blog. I&amp;rsquo;ve been meaning to do this for
a very long time. I&amp;rsquo;ve
&lt;a href=&quot;https://www.bbc.co.uk/rd/blog/2013/10/audio-waveforms&quot;&gt;written&lt;/a&gt;
&lt;a href=&quot;https://www.bbc.co.uk/rd/blog/2013/03/authentication-for-connected-tvs&quot;&gt;a&lt;/a&gt;
&lt;a href=&quot;https://www.bbc.co.uk/blogs/rad/2009/02/experiments_with_radiodns.html&quot;&gt;few&lt;/a&gt;
&lt;a href=&quot;https://www.bbc.co.uk/blogs/rad/2009/07/radiodns_demo_application_released.html&quot;&gt;posts&lt;/a&gt;
for the &lt;a href=&quot;https://www.bbc.co.uk/rd/blog&quot;&gt;company blog&lt;/a&gt; at work, so I figured I
really should have my own blog. And here it is.&lt;/p&gt;

&lt;p&gt;When deciding what tools to use for this, I looked at a few of the
static site generators currently available before finally settling on
&lt;a href=&quot;https://jekyllrb.com&quot;&gt;Jekyll&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;m not a designer by any means, so for the blog&amp;rsquo;s stylesheet I searched around
for styles that I could &amp;ldquo;borrow&amp;rdquo;, and found the
&lt;a href=&quot;https://wordpress.org/themes/fanoe&quot;&gt;Fanoe&lt;/a&gt; WordPress theme by &lt;a href=&quot;https://webdesign-florian-brinkmann.de/&quot;&gt;Florian
Brinkmann&lt;/a&gt;, which I&amp;rsquo;ve adapted
a bit.&lt;/p&gt;

&lt;p&gt;I wanted to use a CSS extension language, to be able
to use variables, and take advantage of better handling of nested rules than
plain CSS. I chose &lt;a href=&quot;https://sass-lang.com/&quot;&gt;Sass&lt;/a&gt; for no better reason than it
was the first one I tried out with Jekyll, the
&lt;a href=&quot;https://github.com/noct/jekyll-sass&quot;&gt;jekyll-sass&lt;/a&gt; plugin just worked, and
I was able to set config options to compress the generated CSS.&lt;/p&gt;

&lt;p&gt;The icons on the &lt;a href=&quot;/about&quot;&gt;About&lt;/a&gt; page come from &lt;a href=&quot;https://fontawesome.io&quot;&gt;Font
Awesome&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now I just need to find the time to write some more posts&amp;hellip;&lt;/p&gt;
</content>
    </entry>
  
</feed>
