Friday 27 December 2013

Copy full path for current file in Total Commander

Assign a hotkey to the cm_CopyFullNamesToClip command (I use Ctrl+P).


To put the same path into the TC command line, use Ctrl+Shift+Enter.

Run Visual Studio under Admin always

Of course, you can change the properties of the "Microsoft Visual Studio 2010" shortcut in the Start menu, but that is not enough if you want to open your sln-files from Explorer in Admin mode. So also set Admin rights for c:\Program Files (x86)\Common Files\microsoft shared\MSENV\VSLauncher.exe. If you don't, you will get something like "do you want to save changes to devenv.sln?" after opening a solution.

Tuesday 3 December 2013

Use HunSpell in .NET and speed it up

To use HunSpell for Russian:

1) Take a demo-archive from NHunSpell.
2) Take AOT-based dictionary from LibreOffice. Open oxt-file with 7zip, take .dic and .aff files.
3) Load them in CSharpConsoleSample program, use the demo to test.
4) Update to latest DLLs.

The speed of this solution is acceptable for context suggesters, but not for search applications. HunSpell provides rich lists of variants, while we need short, precise lists. To get them, deactivate ngram-based suggestions by editing the aff-file: add
MAXNGRAMSUGS 0
after
SET KOI8-R
This increases the speed dramatically (from 2 secs to several msecs for the list of 5 misspelled words), but you can get empty lists on some nontrivial typos.
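
If you patch several dictionaries, the edit can be scripted. Below is a minimal sketch of the same change, not part of NHunSpell: the function name disable_ngram_suggestions is made up for this example, and it assumes the aff-file has already been read into a string with LF line endings. It works on raw bytes, so the KOI8-R content is passed through untouched.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns the .aff text with "MAXNGRAMSUGS 0"
// inserted right after the first line that starts with "SET ".
// The text is returned unchanged if the option is already present
// or if no SET line is found.
std::string disable_ngram_suggestions(const std::string& aff_text)
{
    if (aff_text.find("MAXNGRAMSUGS") != std::string::npos)
        return aff_text;                       // already configured

    std::istringstream in(aff_text);
    std::string out, line;
    bool inserted = false;
    while (std::getline(in, line)) {
        out += line;
        out += '\n';
        if (!inserted && line.compare(0, 4, "SET ") == 0) {
            out += "MAXNGRAMSUGS 0\n";         // the line described above
            inserted = true;
        }
    }
    return inserted ? out : aff_text;
}
```

Read the file in binary mode, pass its content through the function and write the result back. Keep a copy of the original aff-file in case you want the rich suggestion lists again.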

Tuesday 12 November 2013

Some strange problems with MS Word

After living for a while with fresh installations of different MS Word versions, you can run into strange problems: e.g., a "File not found" window from VBA when creating the default document in Office 2013, or losing some panels (from ORFO, for example) in Office 2003. One action helps in most cases. Close Word, go to c:\Users\%UserName%\AppData\Roaming\Microsoft and rename the Templates dir to something else. On the next start, Word recreates the default Templates folder, which should fix the vast majority of "strange" problems.

Thursday 7 November 2013

Precise time in Windows batch

The well-known time /T prints only HH:MM. But win-batches actually have access to hundredths of a second. To have them printed, use this: echo %time%. It prints something like HH:MM:SS.cc. Use this variable any way you want.

UTF-8 locales in FreeBSD, Linux and Windows

1. FreeBSD

We use FreeBSD 8.3 and gcc 4.6.4, work with the OS terminal in putty.
In your home directory, edit .login_conf:

me:\
:charset=UTF-8:\
:lang=ru_RU.UTF-8:\
:setenv=LC_COLLATE=C:

Restart all putty instances.
In putty, set Change settings... → Translation → Remote character set = UTF-8.
Check locale in terminal, it should be:

$ locale
LANG=ru_RU.UTF-8
LC_CTYPE="ru_RU.UTF-8"
LC_COLLATE=C
LC_TIME="ru_RU.UTF-8"
LC_NUMERIC="ru_RU.UTF-8"
LC_MONETARY="ru_RU.UTF-8"
LC_MESSAGES="ru_RU.UTF-8"
LC_ALL=

In your C++ program, use UTF-8 natively, as char*, and write it to std::cout. For cout you don't need to call setlocale (LC_ALL, ""); from <clocale>. You can also convert UTF-8 strings to UTF-16 and print them to wcout; the output is the same as with cout, but wcout does need a setlocale call (see the Windows section).
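
Here is a tiny sketch of the "UTF-8 as plain char*" idea. The function name greeting_utf8 is invented for the example, and the text is spelled as explicit bytes so that the example does not depend on the encoding of the source file.

```cpp
#include <string>

// Hypothetical example: the word "Привет" as raw UTF-8 bytes.
// Each Cyrillic letter takes two bytes, so six letters give 12 bytes.
std::string greeting_utf8()
{
    return "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";
}
```

Writing it out is just std::cout << greeting_utf8() << '\n'; the stream passes the bytes through as they are, and the terminal (putty with Remote character set = UTF-8) does the rendering.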

Note: for those who don't like Russian UI-s, like me. You can set en_US.UTF-8 instead of ru_RU.UTF-8 in .login_conf to have system messages in English. The effect on output correctness will be just the same.

2. Windows

There are many approaches to overcoming the lack of UTF-8 support in the Windows console. The following "lazy" approach works for me, and it is also easily portable to Unix systems. With this approach, you keep everything inside your software in UTF-8, and convert strings from UTF-8 to UTF-16 with the portable utf8cpp library only to output them to the console.

It's pretty straightforward. At the very start of your program, add:
setlocale (LC_ALL, "Russian");
Use wcin and wcout, and convert UTF-8 strings to UTF-16 with utf8::utf8to16 when outputting.
No need to run chcp in the console!
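
To show what the conversion step does, here is a dependency-free sketch. It is a hand-written stand-in, not the utf8cpp code: the name utf8_to_wide is invented, well-formed input is assumed, and errors are not detected. In a real program, use utf8::utf8to16 as described above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for utf8::utf8to16. Decodes well-formed UTF-8
// into a wide string suitable for wcout. Where wchar_t is 16 bits
// (Windows) code points above U+FFFF become surrogate pairs; where it
// is 32 bits they are stored directly.
std::wstring utf8_to_wide(const std::string& in)
{
    std::wstring out;
    for (std::size_t i = 0; i < in.size(); ) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp;
        int extra;                               // continuation bytes to read
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
        else               { cp = b & 0x07; extra = 3; }
        ++i;
        for (; extra > 0 && i < in.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);

        if (sizeof(wchar_t) >= 4 || cp < 0x10000) {
            out += static_cast<wchar_t>(cp);
        } else {
            cp -= 0x10000;                       // split into a surrogate pair
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}
```

After setlocale (LC_ALL, "Russian"), printing is then std::wcout << utf8_to_wide (text) << L'\n'.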

It's portable to FreeBSD in the following way. In FreeBSD, tuned as described earlier, you get the same output with UTF-16's wcout as with UTF-8's cout. Just try it, but don't forget to add setlocale (LC_ALL, "") or setlocale (LC_ALL, "ru_RU.UTF-8"); for wcout-printing it's important in FreeBSD.

3. Linux

You can do exactly the same as in FreeBSD. I did the check on a fresh Ubuntu Server 13.04 with gcc 4.7.3 on board, and there I had to install the Russian locale. Firstly, check which locales you already have, with locale -a. Don't let the "utf8" name part confuse you, using "UTF-8" everywhere is just right. This holds across all Unix-based systems. Secondly, if you don't have Russian locales, here is how to add them:

sudo locale-gen ru_RU
sudo locale-gen ru_RU.UTF-8
sudo dpkg-reconfigure locales
sudo update-locale

In comparison to the FreeBSD experience, bear in mind one very important thing: avoid mixing fprintf/wprintf, cout/wcout, etc. E.g., if you set your locale, use cout, and then use wcout, the latter prints junk. Remove all cout uses, and wcout starts to work properly. Actually, one should always avoid mixing w and non-w versions of output, it's a good rule of thumb. Nevertheless, this bad pattern passes okay (or doesn't reveal any errors) in Windows and FreeBSD consoles, but not in the Linux console, which is undoubtedly the right behavior in general.
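
A likely explanation is stream orientation from the C standard: the first I/O operation makes a stream either byte-oriented or wide-oriented, and it stays that way until it is reopened. Since cout and wcout normally share stdout with the C library, the first cout output can lock stdout into byte mode. The sketch below only demonstrates the orientation rule on a temporary file; the function names are invented for the example, and it does not prove what a particular console does.

```cpp
#include <cstdio>
#include <cwchar>

// Opens a fresh temporary file, performs one write (narrow or wide) and
// returns the orientation reported by fwide(f, 0):
// negative = byte-oriented, positive = wide-oriented, zero = not set.
int orientation_after_first_write(bool wide)
{
    std::FILE* f = std::tmpfile();
    if (!f) return 0;                   // no temporary file available
    if (wide) std::fputws(L"wide", f);
    else      std::fputs("narrow", f);
    int result = std::fwide(f, 0);      // mode 0 only queries
    std::fclose(f);
    return result;
}

// Once a stream is byte-oriented, a request for wide orientation
// is ignored and fwide keeps reporting a negative value.
int orientation_after_narrow_then_request_wide()
{
    std::FILE* f = std::tmpfile();
    if (!f) return 0;
    std::fputs("narrow", f);
    int result = std::fwide(f, 1);      // the request has no effect
    std::fclose(f);
    return result;
}
```

Calling std::fwide (stdout, 0) in your own program right before the first wcout output is a cheap way to check whether something has already written narrow text.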

Note: in VMWare window, the localized output is crappy: some symbols are ok, while others look like ◊. I don't know how to deal with this, for now I have to attach to my VM through putty. As an advantage, I have mouse and Russian keyboard inputs working.

4. Combining across OS-es

In general, LC_ALL is not a good practice, yet it works :)

There are two ways to deal with it.

Using setlocale in Unix-OS, easily Windows-portable

1) Call setlocale (LC_ALL, "Russian") at the start of the program in Windows, setlocale (LC_ALL, "ru_RU.UTF-8") in FreeBSD/Linux
2) Write to console with wcout, converting output from utf8 with utf8::utf8to16
3) Don't use cout at all

Without setlocale in Unix-OS, not-so-easily Windows-portable

1) Use cout in Unix-based systems, wcout in Windows
2) Convert utf8 to utf16 with utf8::utf8to16 only in Windows
3) Call setlocale (LC_ALL, "Russian") at the start only in Windows

The same applies to printing to other streams (file streams, piped output, stderr).
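
The first way can be sketched like this. The names console_locale_name and init_console_locale are invented for the example; the locale strings are the ones from the steps above, and the _WIN32 macro is used as the platform switch.

```cpp
#include <clocale>
#include <string>

// Locale name for setlocale on each platform, as in step 1 above.
std::string console_locale_name()
{
#ifdef _WIN32
    return "Russian";
#else
    return "ru_RU.UTF-8";
#endif
}

// Call once at program start, before any console output.
// Returns false if the locale is not installed on this machine.
bool init_console_locale()
{
    return std::setlocale(LC_ALL, console_locale_name().c_str()) != nullptr;
}
```

Checking the return value is worth it: on a fresh Linux box without the Russian locale, setlocale fails silently and wcout output breaks later, far away from the real cause.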

One can overcome the disadvantage of LC_ALL with a function like this:

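#include <ios>
#include <locale>
// basic_ios::imbue, unlike ios_base::imbue, also imbues the stream buffer
template <class CharT, class Traits>
void attach_to_rus_locale (std::basic_ios<CharT, Traits>& stream)
{
    std::locale loc ("ru_RU.UTF-8");
    stream.imbue (loc);
}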

and pass std::wcout, file streams and so on to it. Unfortunately, utf8::utf8to16 is somehow affected by the locale as well. Someone should figure out how to deal with that.
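
One practical detail: the std::locale constructor throws std::runtime_error when the named locale is not installed. A slightly more defensive variant, sketched here under the invented name try_imbue, reports that instead of crashing. It takes std::basic_ios so that the stream buffer gets the locale too.

```cpp
#include <iostream>
#include <locale>
#include <stdexcept>

// Hypothetical helper: tries to imbue the named locale into a stream.
// Returns false and leaves the stream untouched if the locale is not
// available on this system.
template <class CharT, class Traits>
bool try_imbue(std::basic_ios<CharT, Traits>& stream, const char* name)
{
    try {
        stream.imbue(std::locale(name));   // constructor may throw
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

For example, if (!try_imbue (std::wcout, "ru_RU.UTF-8")) you can fall back to try_imbue (std::wcout, "C") and print a warning to the user.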

Wednesday 6 November 2013

SVN E000022

svn: E000022: Can't convert string from 'UTF-8' to native encoding
Cure it with this, depending on your shell:
sh/bash (the usual default in Linux) → export LC_CTYPE=en_US.UTF-8
csh/tcsh (the usual default in FreeBSD) → setenv LC_CTYPE en_US.UTF-8