Friday 27 December 2013

Copy full path for current file in Total Commander

Add keystroke for the cm_CopyFullNamesToClip command (I use Ctrl+P):


To put the same path into the TC command line, press Ctrl+Shift+Enter.

Run Visual Studio under Admin always

Of course you can enable running as administrator in the properties of the "Microsoft Visual Studio 2010" link in the Start menu, but that doesn't suffice if you want to open your sln-files from Explorer in Admin mode. So also set Admin rights for c:\Program Files (x86)\Common Files\microsoft shared\MSENV\VSLauncher.exe. If you don't do this, you will get something like "do you want to save changes to devenv.sln?" after opening the solution.

Tuesday 3 December 2013

Use HunSpell in .NET and speed it up

To use HunSpell for Russian:

1) Take a demo-archive from NHunSpell.
2) Take AOT-based dictionary from LibreOffice. Open oxt-file with 7zip, take .dic and .aff files.
3) Load them in CSharpConsoleSample program, use the demo to test.
4) Update to latest DLLs.

The speed of the solution is acceptable in context suggesters, but not in search applications. HunSpell provides rich lists of variants, while we need short, precise lists. To achieve it, deactivate ngram-based suggesting by editing aff-file: add
MAXNGRAMSUGS 0
after
SET KOI8-R
This increases the speed dramatically (from 2 secs to several msecs for the list of 5 misspelled words), but you can get empty lists on some nontrivial typos.

Tuesday 12 November 2013

Some strange problems with MS Word

After living for a while with fresh installations of different MS Word versions, you can face strange problems: e.g., a "File not found" window from VBA while creating a default document in Office 2013, or losing some panels from, e.g., ORFO, in Office 2003. One action will help you sort it out. Close Word, go to c:\Users\%UserName%\AppData\Roaming\Microsoft and rename the Templates dir to something else. When Word starts again, it recreates the default Templates folder, which should fix the vast majority of "strange" problems.

Friday 8 November 2013

Precise time in Windows batch

The well-known time /T prints only HH:MM. But batch files actually have access to hundredths of a second. To print them, use echo %time%, which gives HH:MM:SS.cc (the decimal separator depends on regional settings). Use this variable any way you want.

Thursday 7 November 2013

UTF-8 locales in FreeBSD, Linux and Windows

1. FreeBSD

We use FreeBSD 8.3 and gcc 4.6.4, work with the OS terminal in putty.
In your home directory, edit .login_conf:

me:\
:charset=UTF-8:\
:lang=ru_RU.UTF-8:\
:setenv=LC_COLLATE=C:

Restart all putty instances.
In putty, set Change settings... → Translation → Remote character set = UTF-8.
Check locale in terminal, it should be:

$ locale
LANG=ru_RU.UTF-8
LC_CTYPE="ru_RU.UTF-8"
LC_COLLATE=C
LC_TIME="ru_RU.UTF-8"
LC_NUMERIC="ru_RU.UTF-8"
LC_MONETARY="ru_RU.UTF-8"
LC_MESSAGES="ru_RU.UTF-8"
LC_ALL=

In your C++ program, use UTF-8 natively, as char*, and write it to std::cout; for this you don't need to call setlocale (LC_ALL, ""); from <clocale>. You can also convert UTF-8 strings to UTF-16 and print them to wcout, which gives the same output as cout, but wcout does need the setlocale call.

Note: for those who don't like Russian UIs, like me. You can set en_US.UTF-8 instead of ru_RU.UTF-8 in .login_conf to have system messages in English. The effect on output correctness will be just the same.

2. Windows

There are many approaches to overcoming the lack of UTF-8 support in the Windows console. The following "lazy" approach works for me, and it is easily portable to Unix systems. With this approach, you keep everything inside your software in UTF-8, and convert strings from UTF-8 to UTF-16 with the portable utf8cpp library only to output them to the console.

It's pretty straightforward. In the very start of your program, add:
setlocale (LC_ALL, "Russian");
Use wcin and wcout, convert UTF-8 strings to UTF-16 with utf8::utf8to16 when outputting.
No need to make chcp in console!

It's portable to FreeBSD as follows. On a FreeBSD system tuned as described earlier, you get the same output from wcout with UTF-16 as from cout with UTF-8. Just try it, but don't forget to add setlocale (LC_ALL, "") or setlocale (LC_ALL, "ru_RU.UTF-8"); for printing through wcout this is important in FreeBSD.
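To make the conversion step concrete, here is a minimal sketch of what turning UTF-8 into UTF-16 involves. It is a hand-written stand-in, not utf8cpp's code: the name utf8_to_utf16 is hypothetical, and the sketch does not reject overlong forms or encoded surrogates, so in a real program a library routine such as utf8::utf8to16 remains the better choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (NOT part of utf8cpp): decodes UTF-8 bytes and
// re-encodes them as UTF-16 code units.
// Throws std::invalid_argument on a bad lead byte, a bad continuation
// byte, a truncated sequence, or a code point above U+10FFFF.
// Simplification: overlong forms and encoded surrogates are not rejected.
std::u16string utf8_to_utf16(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;   // decoded code point
        std::size_t len = 0;    // length of the UTF-8 sequence in bytes

        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("bad UTF-8 lead byte");

        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("bad UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += len;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");

        if (cp < 0x10000) {
            // Basic Multilingual Plane: one UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary planes: a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, where wchar_t is 16 bits wide, the result can be copied into a std::wstring and written to wcout after the setlocale call. Cyrillic letters all fit into single code units, so the surrogate branch only matters for characters outside the Basic Multilingual Plane.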

3. Linux

You can proceed exactly as in FreeBSD. I checked this on a fresh Ubuntu Server 13.04 with gcc 4.7.3 on board, and had to install the Russian locale. First, check which locales you already have with locale -a. Don't let the "utf8" part of the names confuse you; using "UTF-8" everywhere is just right. This holds across all Unix-based systems. Second, if you don't have Russian locales, here is how to add them:

sudo locale-gen ru_RU
sudo locale-gen ru_RU.UTF-8
sudo dpkg-reconfigure locales
sudo update-locale

In comparison to the FreeBSD experience, bear in mind one very important thing: avoid mixing together fprintf/wprintf, cout/wcout, etc. E.g., after setting your locale, if you use cout, and then wcout, the latter would print you junk. Remove all cout uses, and wcout starts to work properly. Actually, one should always avoid mixing w and not-w versions of output together, it's a good rule of thumb. Nevertheless, this bad pattern passes okay (or doesn't reveal any errors) in Windows and FreeBSD consoles, but not in Linux console, which is undoubtedly right behavior in general.
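One likely reason for the junk is the C notion of stream orientation: a FILE stream becomes byte-oriented or wide-oriented on its first I/O operation, and the C standard does not allow the other kind of call on it afterwards. The sketch below only queries that state. The helper names orientation_of and safe_for_wide_output are hypothetical, and the link to cout/wcout rests on the assumption that both end up writing to stdout, which I believe is how gcc's standard library behaves by default.

```cpp
#include <cstdio>
#include <cwchar>
#include <string>

// Hypothetical helper: reports the orientation of a C stream.
// std::fwide with mode 0 only queries the orientation, it does not set it.
std::string orientation_of(std::FILE* stream)
{
    const int o = std::fwide(stream, 0);
    if (o > 0) return "wide";
    if (o < 0) return "byte";
    return "unset";
}

// Hypothetical helper: true if wide output (fputws, wprintf, and by
// assumption wcout) may still be used on this stream, i.e. no narrow
// I/O has fixed it as byte-oriented yet.
bool safe_for_wide_output(std::FILE* stream)
{
    return std::fwide(stream, 0) >= 0;
}
```

Calling orientation_of(stdout) at the very start of main and again after the first cout or printf shows the switch on a C library that tracks orientation. Not every C library does, which would fit the observation that the mixed pattern goes unnoticed on some systems.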

Note: in VMWare window, the localized output is crappy: some symbols are ok, while others look like ◊. I don't know how to deal with this, for now I have to attach to my VM through putty. As an advantage, I have mouse and Russian keyboard inputs working.

4. Combining across OS-es

In general, LC_ALL is not a good practice, yet it works :)

There are two ways to deal with it.

Using setlocale in Unix-OS, easily Windows-portable

1) Call setlocale (LC_ALL, "Russian") at the start of the program in Windows, setlocale (LC_ALL, "ru_RU.UTF-8") in FreeBSD/Linux
2) Write to console with wcout, with converting output from utf8 with utf8::utf8to16
3) Don't use cout at all

Without setlocale in Unix-OS, not-so-easily Windows-portable

1) Use cout in Unix-based systems, wcout in Windows
2) Convert utf8 to utf16 with utf8::utf8to16 only in Windows
3) Call setlocale (LC_ALL, "Russian") at the start only in Windows

The same is with printing to streams (file stream, piping output, stderr).

One can overcome the disadvantage of LC_ALL with such function:

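template <class Stream>
void attach_to_rus_locale (Stream& stream)
{
    // Take the concrete stream type: imbue called through std::ios_base&
    // does not pass the locale on to the stream buffer.
    std::locale loc ("ru_RU.UTF-8");
    stream.imbue (loc);
}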

and pass there std::wcout, file streams and so on. Unfortunately, utf8::utf8to16 is somehow affected by the locale also. Someone should figure out how to deal with that.

Wednesday 6 November 2013

SVN E000022

svn: E000022: Can't convert string from 'UTF-8' to native encoding
Cure it with this:
csh/tcsh (e.g. on FreeBSD) → setenv LC_CTYPE en_US.UTF-8
sh/bash (e.g. on Linux) → export LC_CTYPE=en_US.UTF-8

Monday 4 November 2013

Unix terminal font for Visual Studio

For me, the default fonts in Visual Studio have always looked too thin.

1. Courier New, 11


2. Consolas, 10


3. Anonymous Pro, 12 (my favorite one with ClearType)


But I think I've found the solution – Text Sharp + Classic Console 15 pt:


Just do this:


And here you go with crisp, thick, compact and classic Unix-terminal font in the most modern environments. And, yes, it has Cyrillic chars inside. It gives you precisely the same look and feel as in Linux terminal:


Also you can choose Fixedsys Excelior, 12


But it's not so Unix-style, more Windows-style. It looks heavier than Classic Console, but actually has the same height. Still, the details of Classic Console are much nicer to me; they bring me back to the mellow days of coding in 80x25 QuickBasic 4.5 when I was 12 :)

P.S. Many thanks to SZÉLL Csaba, the author of Classic Console. He kindly fixed some minor problems with his font that I had noticed, and now it's good not only for the cmd window, but for VS as well.

Friday 1 November 2013

Ubuntu Server in VMWare

Tune the screen resolution

In /etc/default/grub change:
GRUB_CMDLINE_LINUX_DEFAULT="splash vga=792"
or you can choose from vga=786, 789, 792, 795, 799
then delete
GRUB_CMDLINE_LINUX=""
and do
sudo update-grub
sudo reboot

Install GCC

sudo apt-get clean
sudo apt-get update 
sudo apt-get upgrade -f
sudo apt-get install gcc
sudo apt-get install g++
sudo apt-get install make

Set SSH access from host

Run these under sudo; my user is notroot, and the VMware image is from thoughtpolice.
sudo apt-get install openssh-server
/etc/init.d/ssh restart
ifconfig #watch IP-address
usermod -U notroot

Thursday 24 October 2013

Visual C++: unnecessary rebuilds

You hit F5, and one of your (often third-party) solution projects wants to be built although nothing in it has changed. It annoys you, run after run. The solution is the following: check whether the project includes source files that physically don't exist. Remove them, and with high probability everything will be ok.

Add: also see http://rdaemons.blogspot.se/2011/01/visual-c-2010-up-to-date-project-always.html.

Thursday 17 October 2013

False commas aren't like 10 years ago

This phenomenon in the Russian language has usually been discussed in the context of dashes and ellipses. One can also mention this 2004 article about replacing slang and obscenities with commas. Since the end of 2012, we have noticed that such commas are almost everywhere: not only in blogs and social networks, but also in online press and offline ads.

Here is the evidence of the last 2 months:

2013-11-10:
В сентябре, Владимир Путин также обмолвился, что такой поворот событий не исключен
Приехавшие к месту убийства правоохранители, просмотрели видеозапись и опросили потерпевшую
2013-10-01: Ваша цель, слушать час!
2013-09-27: Плотность пикселей будет тогда больше чем, в ретине.
2013-09-25: Спикер Совета Федерации Валентина Матвиенко, предложила обсудить возможность возвращения в избирательные бюллетени графы "против всех"

Wednesday 9 October 2013

A strange Windows update

In our stack, we have a rule base compiler: an exe-file that generates a bat-file and calls it. Yesterday, right after installing the monthly updates from Microsoft, this compiler stopped working properly on some of our Windows 7 SP1 x64 machines, showing the «Application has stopped working» window at the moment the exe called the bat-file. The bat-file ran fine on its own. Neither deactivating UAC or Security Essentials nor running everything as Administrator helped. Finally, we got the software working right after setting "Compatibility: Windows 7" for our exe. It seems that Microsoft has shipped a strange security update which treats exe-s that run bat-s as "bad". Our bat actually read the registry, because it ran vsvars32.bat. Maybe an MS intern plays pranks? :) We didn't figure out exactly which update it was; hopefully they'll fix the problem in a couple of weeks.

Monday 7 October 2013

Build Rus, Eng, Ger AOT morpho for .NET

1) Download trunk from http://seman.sourceforge.net/. I check it out to p:\SEMAN\. Build Debug. Don't try to build other configs; they fail even worse than the Debug configuration.
2) Download http://aot.ru/download.php — RusLemmatizer.zip and MorphWizard.zip. Install with default paths.
3) Delete all from c:\Rml\Dicts\.
4) Copy p:\SEMAN\Dicts\Morph\ to c:\Rml\Dicts\.
5) Copy p:\SEMAN\Dicts\SrcMorph\ to c:\Rml\Dicts\.
6) Take MorphGen.exe from p:\SEMAN\Source\MorphGen\Debug and put it into c:\Rml\Bin\, replacing the existing one.
7) Now just run eng_gen.bat, ger_gen.bat and rus_gen.bat. It is slow, schedule 2-4 hours.
8) Use the resulting binaries with Lemmatizer.NET, which is compilable from p:\SEMAN\Source\LemmatizerNET.sln .
9) But introduce a little workaround to Lemmatizer.NET. In Lemmatizer.cs, in LoadDictionariesRegistry function, replace the following:

_useStatistic = true;
_statistic.Load(this, "l", manager);

with:

if (Language == InternalMorphLanguage.morphRussian)
{
   _useStatistic = true;
   _statistic.Load(this, "l", manager);
}
else
{
   _useStatistic = false;
}

10) To work with dictionaries, make separate folder like p:\Lemmatize\Rml and put there c:\Rml\Bin and c:\Rml\Dicts.
11) The minimal test program to do lemmatizing is the following.

class Program
{
  static void Main(string[] args)
  {
    Console.WriteLine("Enter Russian word (0 to exit)");
    ILemmatizer lem = LemmatizerFactory.
      Create(MorphLanguage.Russian);
    var manager = FileManager.
      GetFileManager(@"p:\Lemmatize\Rml"); // make it relative!
    lem.LoadDictionariesRegistry(manager);
    while (true)
    {
      Console.Write("> ");
      string word = Console.ReadLine();
      // Stop on the exit marker (or end of input) before lemmatizing it
      if (word == null || word == "0") break;
      var paradigmList = lem.
        CreateParadigmCollectionFromForm(word, false, true);
      for (var i = 0; i < paradigmList.Count; i++)
      {
        var paradigm = paradigmList[i];
        Console.WriteLine ("\t" + paradigm.Norm);
      }
    }
  }
}

Friday 4 October 2013

File assocs grab between versions of MS Office

Sometimes, e.g., Office 2010 takes over the file associations of a previously installed Office 2003, then Office 2003 takes them back, and so on. To stop this, use this batch:

set P=HKCU\Software\Microsoft\Office

reg add %P%\11.0\Word\Options /v NoReReg /t REG_DWORD /d 1
reg add %P%\12.0\Word\Options /v NoReReg /t REG_DWORD /d 1
reg add %P%\15.0\Word\Options /v NoReReg /t REG_DWORD /d 1

reg add %P%\11.0\Excel\Options /v NoReReg /t REG_DWORD /d 1
reg add %P%\12.0\Excel\Options /v NoReReg /t REG_DWORD /d 1
reg add %P%\15.0\Excel\Options /v NoReReg /t REG_DWORD /d 1

... the same for PowerPoint, Access ...

1) Substitute the versions of MS Office installed on your machine (11.0 is Office 2003, 12.0 is Office 2007, 14.0 is Office 2010, 15.0 is Office 2013; there is no 13.0).
2) Check whether other programs like OneNote have such branches in registry, I haven't checked.
3) Never use spaces around "=" sign in set operation :)

Thursday 3 October 2013

Restore file associations and icons under Windows

Software for Windows likes to override icons and extension associations, and it's then difficult to bring them back to a normal state. For instance, you can get crappy icons on Word files after reinstalling MS Office. The solution to this mess is damn simple.

Backup.bat
FTYPE > backup_types.txt
ASSOC > backup_ext.txt

Restore.bat
FOR /F "tokens=* delims=" %%G IN (backup_types.txt) DO FTYPE %%G
FOR /F "tokens=* delims=" %%G IN (backup_ext.txt) DO ASSOC %%G

Don't forget to generate those two files and put them in a safe place after a fresh install of Windows and your software. For those who are searching for the solution like I did, here is the list of keywords to let Google find the post:
  • restore file associations Windows
  • restore icons Windows
  • restore extension associations Windows
  • how to restore file associations Windows
  • restore file types Windows
  • recover file associations Windows

Wednesday 28 August 2013

MS Word facts, really useful

Store them for later reference.
  • Fast swap current paragraph with upper/lower: Alt+Shift+↑/↓
  • Select current sentence: Ctrl+MouseLeft (on any word)
  • Select current paragraph: triple left mouse click
  • Move text: F2 on selection → Enter at the destination
  • Clear formatting of the selection: Ctrl+Space
  • =rand(N,M) — generate N paragraphs, each of M sentences from help
  • =lorem(N,M) — the same, but not random and from "Lorem Ipsum"
  • You can insert EPS images into Word document by "Insert picture"
  • Excel: Alt+Enter to enter new line in a current cell

Friday 16 August 2013

Batch to remove empty dirs in Windows

Starting from the directory it is run in: for /f "usebackq delims=" %%d in (`"dir /ad/b/s | sort /R"`) do rd "%%d"
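For cases where the same cleanup has to live inside a program, the idea of the one-liner (list all directories, process children before parents, delete only what is empty) can be sketched in C++17 with std::filesystem. The function name remove_empty_dirs is hypothetical, and this is an illustration of the same technique rather than a tested replacement for the batch.

```cpp
#include <algorithm>
#include <cstddef>
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: removes every empty directory below `root`
// (root itself is kept) and returns how many directories were removed.
std::size_t remove_empty_dirs(const fs::path& root)
{
    // Collect all real directories first, so nothing is deleted
    // while the tree is still being iterated.
    std::vector<fs::path> dirs;
    for (const auto& entry : fs::recursive_directory_iterator(root))
        if (entry.is_directory() && !entry.is_symlink())
            dirs.push_back(entry.path());

    // Descending order puts children before their parents;
    // this plays the role of `sort /R` in the batch one-liner.
    std::sort(dirs.rbegin(), dirs.rend());

    std::size_t removed = 0;
    for (const auto& dir : dirs) {
        std::error_code ec;
        // Like `rd` without /s: only empty directories are deleted.
        // Errors (e.g. access denied) are skipped, not thrown.
        if (fs::is_empty(dir, ec) && !ec && fs::remove(dir, ec))
            ++removed;
    }
    return removed;
}
```

Because children are handled first, a directory that only contained empty directories becomes empty itself and is removed in the same pass. Calling remove_empty_dirs(fs::current_path()) mirrors running the batch from the current directory.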

Wednesday 7 August 2013

Tuesday 2 July 2013

Note on LaTeX in Windows

A convenient way of using LaTeX in Windows:
1) Use MikTeX package,
2) Install SumatraPDF,
3) Set MikTeX to show PS and PDF in SumatraPDF.
What this gives you: SumatraPDF automatically updates the view of the document after it changes on disk. The important thing is that your point of focus in SumatraPDF doesn't change at all during the update: no scrolling, no jumping to the first page, no hitting Ctrl-R. So you can recompile your LaTeX in the MikTeX window and see the updates in PS or PDF immediately, even without switching to the SumatraPDF window. To enjoy this fully, place the MikTeX window on the right and the SumatraPDF window on the left, and here we go.

Will post here some useful stuff on preparing ACL articles later.