2012-06-19

Protected Mode and IMEs

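There is no unified specification or standard for "Protected Mode" across Windows applications. Each product implements the technology it calls protected mode in its own way. Just because an IME works (or appears to work) with product A's "protected mode" enabled does not mean the same IME will work (or appear to work) with product B's "protected mode" enabled.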

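For example, Google Chrome separates the browser process, which handles windowing, from the renderer processes, which render HTML, and Chrome's protected mode is applied to the renderer processes. An IME is loaded into the browser process, where windowing takes place, so from the IME's point of view Chrome is (setting plugins aside for the moment) just an ordinary desktop application, like Notepad.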

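On a system with UAC enabled, Microsoft Internet Explorer performs both windowing and HTML rendering in a process whose integrity level is set to Low. Because the IME is loaded into the process that handles windowing, it has to work inside a Low integrity process. Internet Explorer changes very little about the process other than setting its integrity level to Low *1, so simply launching Notepad with the -l option of PsExec and testing the IME there is already quite useful.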

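As for the "protected mode" of Adobe Reader and of Adobe Flash Player for Firefox, Adobe has stated publicly that it builds on Chrome's sandbox technology *2. The difference from Chrome is that the process that handles windowing is itself run inside the sandbox. In this case the IME runs inside a sandboxed process.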

Application	Environment the IME runs in
Google Chrome	Same as Notepad and the like
Microsoft Internet Explorer	整合性レベル Low
Adobe Reader X	Sandboxed process

Note that IME issues specific to browser plugins are (for the most part) out of scope here.

*1: Among the few other changes, HKCU is remapped, for instance.

*2: For example, several posts under http://blogs.adobe.com/asset/tag/sandbox mention this.

2010-10-31

IWordBreaker and File Search

Windows 7 Vista

This is about Windows Search: "searching for 『プリキュア』 does not match 『ハートキャッチプリキュア』".

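I found a serious bug in Windows 7, so I'm posting it here to sound the alarm.
The OS used to reproduce it is Windows 7 Home Premium x64.

Steps to reproduce the bug

!!! Do not abuse this !!!

●1. Create a folder somewhere. Any name will do.

●2. Open the folder you created and create three new folders named
「ハートキャッチプリキュア」
「ふたりはプリキュア」
「プリキュア」

●3. Type 「プリキュア」 into the search box

●4. 「ハートキャッチプリキュア」 is treated as if it never existed

Damn it! Who did this! Medic!! Meeedic!!

Workaround

Typing 「*プリキュア」 into the search box seems to match all of them.

Still, back in XP 「プリキュア」 matched everything, so something about this doesn't sit right with me.

By the way, it doesn't seem to matter whether a search index exists.

# Added 2010/10/30 11:05
Apparently this also reproduces on Vista and Mac OS.
Windows users will reportedly be happier using "Everything".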

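This probably ought to be a discussion of "why did a search that used to work as a plain substring match on file names, no thought required, end up needing an asterisk?", but setting that aside, let's revisit IWordBreaker for the first time in a while.
Let's feed 「ハートキャッチプリキュア」 and a few other strings to the Japanese IWordBreaker implementation that ships with Windows 7.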

using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Security;
using Microsoft.Win32;

namespace WordBreakerTest
{
  using HRESULT = System.UInt32;
  public struct HResults
  {
    public const HRESULT S_OK = 0x00000000;
    public const HRESULT S_FALSE = 0x00000001;
    public const HRESULT E_FAIL = 0x80004005;
    public const HRESULT WBREAK_E_END_OF_TEXT = 0x80041780;
    public const HRESULT LANGUAGE_S_LARGE_WORD = 0x00041781;
    public const HRESULT WBREAK_E_QUERY_ONLY = 0x80041782;
    public const HRESULT WBREAK_E_BUFFER_TOO_SMALL = 0x80041783;
    public const HRESULT LANGUAGE_E_DATABASE_NOT_FOUND = 0x80041784;
    public const HRESULT WBREAK_E_INIT_FAILED = 0x80041785;
  }

  public enum WORDREP_BREAK_TYPE
  {
    WORDREP_BREAK_EOW = 0,
    WORDREP_BREAK_EOS = 1,
    WORDREP_BREAK_EOP = 2,
    WORDREP_BREAK_EOC = 3
  }

  [SuppressUnmanagedCodeSecurity]
  [ComImport, Guid("CC907054-C058-101A-B554-08002B33B0E6"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
  public interface IWordSink
  {
    [PreserveSig, MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
    HRESULT PutWord(
        uint cwc,
        [In][MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 0, ArraySubType = UnmanagedType.U2)] char[] pwcInBuf,
        uint cwcSrcLen,
        uint cwcSrcPos);
    [PreserveSig, MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
    HRESULT PutAltWord(
        uint cwc,
        [In][MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 0, ArraySubType = UnmanagedType.U2)] char[] pwcInBuf,
        uint cwcSrcLen,
        uint cwcSrcPos);
    [PreserveSig, MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
    HRESULT StartAltPhrase();
    [PreserveSig, MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
    HRESULT EndAltPhrase();
    [PreserveSig, MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
    HRESULT PutBreak(WORDREP_BREAK_TYPE breakType);
  }

  [SuppressUnmanagedCodeSecurity]
  [ComImport, Guid("CC906FF0-C058-101A-B554-08002B33B0E6"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
  public interface IPhraseSink
  {
    [Obsolete("Not supported.")]
    [PreserveSig, MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
    HRESULT PutSmallPhrase(
        [In][MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1, ArraySubType = UnmanagedType.U2)] char[] pwcNoun,
        uint cwcNoun,
        [In][MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3, ArraySubType = UnmanagedType.U2)] char[] pwcModifier,
        uint cwcModifier, uint ulAttachmentType);
    [PreserveSig, MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
    HRESULT PutPhrase(
        [In][MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1, ArraySubType = UnmanagedType.U2)] char[] pwcPhrase,
        uint cwcPhrase);
  }

  public class WordSink : IWordSink
  {
    public Action<string, uint, uint> OnWord { get; set; }
    public Action<string, uint, uint> OnAltWord { get; set; }
    public Action<WORDREP_BREAK_TYPE> OnBreak { get; set; }
    #region CWordSink Members
    public HRESULT PutWord(
        uint cwc,
        [In][MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 0, ArraySubType = UnmanagedType.U2)] char[] pwcInBuf,
        uint cwcSrcLen,
        uint cwcSrcPos)
    {
      if (OnWord != null)
      {
        OnWord(new string(pwcInBuf), cwcSrcLen, cwcSrcPos);
      }
      return HResults.S_OK;
    }
    public HRESULT PutAltWord(
        uint cwc,
        [In][MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 0, ArraySubType = UnmanagedType.U2)] char[] pwcInBuf,
        uint cwcSrcLen,
        uint cwcSrcPos)
    {
      if (OnAltWord != null)
      {
        OnAltWord(new string(pwcInBuf), cwcSrcLen, cwcSrcPos);
      }
      return HResults.S_OK;
    }
    public HRESULT StartAltPhrase()
    {
      return HResults.S_OK;
    }
    public HRESULT EndAltPhrase()
    {
      return HResults.S_OK;
    }
    public HRESULT PutBreak(WORDREP_BREAK_TYPE breakType)
    {
      if (OnBreak != null)
      {
        OnBreak(breakType);
      }
      return HResults.S_OK;
    }
    #endregion
  }

  public class CPhraseSink : IPhraseSink
  {
    #region CPhraseSink Members
    public HRESULT PutSmallPhrase(
        [In][MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1, ArraySubType = UnmanagedType.U2)] char[] pwcNoun,
        uint cwcNoun,
        [In][MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3, ArraySubType = UnmanagedType.U2)] char[] pwcModifier,
        uint cwcModifier,
        uint ulAttachmentType)
    {
      return HResults.S_OK;
    }
    public HRESULT PutPhrase(
        [In][MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1, ArraySubType = UnmanagedType.U2)] char[] pwcPhrase,
        uint cwcPhrase)
    {
      return HResults.S_OK;
    }
    #endregion
  }

  [UnmanagedFunctionPointer(CallingConvention.StdCall)]
  public delegate uint FillTextBufferDelegate(ref TEXT_SOURCE pTextSource);

  [StructLayout(LayoutKind.Sequential)]
  public struct TEXT_SOURCE
  {
    [MarshalAs(UnmanagedType.FunctionPtr)]
    public FillTextBufferDelegate pfnFillTextBuffer;
    [MarshalAs(UnmanagedType.LPWStr)]
    public string awcBuffer;
    public uint iEnd;
    public uint iCur;
  }

  [SuppressUnmanagedCodeSecurity]
  [ComImport, Guid("D53552C8-77E3-101A-B552-08002B33B0E6"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
  public interface IWordBreaker
  {
    [PreserveSig, MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
    HRESULT Init(
        [MarshalAs(UnmanagedType.Bool)] bool fQuery,
        uint maxTokenSize, [MarshalAs(UnmanagedType.Bool)] out bool pfLicense);
    [PreserveSig, MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
    HRESULT BreakText(
        ref TEXT_SOURCE pTextSource, [MarshalAs(UnmanagedType.Interface)] IWordSink pWordSink,
        [MarshalAs(UnmanagedType.Interface)] IPhraseSink pPhraseSink);
    [PreserveSig, MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
    HRESULT GetLicenseToUse([MarshalAs(UnmanagedType.LPWStr)] out string ppwcsLicense);
  }

  public static class Program
  {
    public static void BreakText(string text, bool forQuery)
    {
      const string kWordBreakerKey =
          @"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex\Language\Japanese_Default";
      var guid = new Guid(Registry.GetValue(kWordBreakerKey, @"WBreakerClass", string.Empty) as string);
      var wordBreakerType = Type.GetTypeFromCLSID(guid);

      // A newer wordbreaker shipped with MS Office 2010.
      // wordBreakerType = Type.GetTypeFromProgID("NLG.Japanese Wordbreaker.4.1");

      var wordBreaker = default(IWordBreaker);
      try
      {
        wordBreaker = Activator.CreateInstance(wordBreakerType) as IWordBreaker;

        var license = true;
        wordBreaker.Init(forQuery, 4096, out license);

        var filler = (FillTextBufferDelegate)((ref TEXT_SOURCE _) => HResults.WBREAK_E_END_OF_TEXT);
        var pTextSource = new TEXT_SOURCE()
        {
          pfnFillTextBuffer = filler,
          awcBuffer = text,
          iCur = 0,
          iEnd = checked((uint)text.Length),
        };

        var dictionary = new Dictionary<WORDREP_BREAK_TYPE, string>
        {
          {WORDREP_BREAK_TYPE.WORDREP_BREAK_EOC, "[EOC]"},
          {WORDREP_BREAK_TYPE.WORDREP_BREAK_EOP, "[EOP]"},
          {WORDREP_BREAK_TYPE.WORDREP_BREAK_EOS, "[EOS]"},
          {WORDREP_BREAK_TYPE.WORDREP_BREAK_EOW, "[EOW]"},
        };

        var words = new List<string>();
        var altWords = new List<string>();
        wordBreaker.BreakText(ref pTextSource, new WordSink
        {
          OnWord = (word, _, __) => words.Add(word),
          OnAltWord = (word, _, __) => altWords.Add(word),
          OnBreak = type => { words.Add(dictionary[type]); altWords.Add(dictionary[type]); },
        }, new CPhraseSink());
        GC.KeepAlive(filler);
        Console.WriteLine("Words: " + string.Join("/", words));
        Console.WriteLine("Alt Words: " + string.Join("/", altWords));
      }
      finally
      {
        // Release the COM object whether or not BreakText succeeded.
        if (wordBreaker != null)
        {
          Marshal.ReleaseComObject(wordBreaker);
          wordBreaker = null;
        }
      }
    }

    [MTAThread]
    static void Main(string[] args)
    {
      BreakText("プリキュア", false);
      BreakText("ふたりはプリキュア", false);
      BreakText("ハートキャッチプリキュア", false);
      BreakText("マイコンピューター", false);
      BreakText("情シス", false);
    }
  }
}

Words: プリキュア
Alt Words:
Words: ふたり/は/プリキュア
Alt Words:
Words: ハトキアッチプリキュア
Alt Words: ハートキャッチプリキュア
Words: マイコンピュタ
Alt Words: マイコンピューター
Words: 情/シス
Alt Words:

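As expected, it doesn't go so far as to split at "プリキュア". In fact, perhaps it assumes that Microsoft's style guide ("compound words written in katakana, other than Western place names, are written with a separator between the words") is being followed, because the behavior looks as though runs of katakana are simply glued together without any analysis. I haven't tested this at all thoroughly, though.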
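One way to read this result, offered as an assumption rather than as documented Windows Search behavior: if a query is only compared against the start of each token the word breaker emits, the unsplit token ハートキャッチプリキュア can never match プリキュア, while the token プリキュア produced from ふたり/は/プリキュア does. A minimal C sketch of that model, operating on UTF-8 bytes (all function names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: a query matches a token only at the token's start. */
int token_has_prefix(const char *token, const char *query)
{
    assert(token != NULL && query != NULL);
    return strncmp(token, query, strlen(query)) == 0;
}

/* Returns 1 if any of the n tokens starts with the query, 0 otherwise. */
int any_token_matches(const char *const *tokens, size_t n, const char *query)
{
    for (size_t i = 0; i < n; ++i) {
        if (token_has_prefix(tokens[i], query)) {
            return 1;
        }
    }
    return 0;
}

/* Plain substring search over the whole name, for comparison. */
int name_contains(const char *name, const char *query)
{
    assert(name != NULL && query != NULL);
    return strstr(name, query) != NULL;
}
```

Under this model the substring search finds プリキュア in all three folder names, while the token-prefix search misses the name that the word breaker left as a single token.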
Incidentally, the WordBreaker that comes with SharePoint can apparently use a user dictionary file, as described below.

Create a custom dictionary for East Asian word breakers (FAST Search Server 2010 for SharePoint)

4. Save the file as follows.

Location: "C:\Program Files\Microsoft Office Servers\12.0\Bin"
(the location of the Japanese word breaker, nlsdata0011.dll)

File name: "Custom0011.lex" (0011 is the language ID)

Character encoding: "Unicode"

As for this nlsdata0011.dll, a file with the same name exists in the system directory on my Japanese Windows 7 machine. As an experiment, I created a file named %SystemRoot%\System32\Custom0011.lex (and %SystemRoot%\SysWOW64\Custom0011.lex), entered the following content, and saved it as UTF-16 with a BOM.

#CUSTOMER_WB
情シス
プリキュア

Running the first program again gave the following result.

Words: プリキュア
Alt Words:
Words: ふたり/は/プリキュア
Alt Words:
Words: ハトキアッチプリキュア
Alt Words: ハートキャッチプリキュア
Words: マイコンピュタ
Alt Words: マイコンピューター
Words: 情シス
Alt Words:

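At least 「情シス」 is now recognized as a single word. The Process Monitor log also confirmed that Custom0011.lex was read during execution.
On the other hand, adding 「プリキュア」 to the user dictionary did not produce the split "ハートキャッチ/プリキュア". This appears to be the same behavior as in the following SharePoint case.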

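Word breaking (configured in: server definition file)

I left this out of the seminar slides, but someone asked about it at the reception afterwards, so I'm writing it down here (I answered the question on the spot at the reception).

For example, when you want to search for a keyword such as 「ペドロ&カプリシャス」, the symbol in the middle (the ampersand, &) causes the keyword to be broken automatically into 「ペドロ」 and 「カプリシャス」 at indexing time. In such cases, configuring a custom dictionary prevents this automatic break and lets you search for an exact match of 「ペドロ&カプリシャス」.

Unlike the thesaurus file, the custom dictionary file goes in %programfiles%\Microsoft Office Servers\12\bin\CustomLANG.lex. (For Japanese it is Custom0011.lex.) For the setting to take effect, re-indexing alone is not enough, because break positions must also be recognized correctly at query time; after editing the file, both restart the Office SharePoint Server Search service (osearch) and run a re-crawl.

The following article is a useful reference on how to create a custom dictionary.

TechNet: Create a user dictionary (Office SharePoint Server 2007)
http://technet.microsoft.com/ja-jp/library/cc263242.aspx

Actually, the question at the reception was not "I want to prevent a word break" but the opposite: "can a word be made to be recognized as split?" I answered that "editing the custom dictionary (the CustomLANG.lex above) might possibly make that work", but I'm sorry: when I checked the behavior, it turned out to be impossible to make a word that is not split in the first place be recognized as split. (My guess was wrong. My apologies . . .)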

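The issue that started all this is also a case of "can a word be made to be recognized as split?", and it looks as though the current generation of Microsoft's IWordBreaker implementations cannot work around it even with a user dictionary. The next thing to try would be to implement IWordBreaker yourself and replace WBreakerClass under HKLM\SYSTEM\CurrentControlSet\Control\ContentIndex\Language\Japanese_Default. I haven't tried it, so I don't know whether it would work.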


2010-05-16

Embedding x86/x86-64 Instructions Directly in C# Code

.NET

The story so far: to speed up a shogi engine written in C#, the author wanted something like the (Visual C++) intrinsic _mm_prefetch, and so combined a DLL written in native code with the main engine written in C#. I see.

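Skimming it, this looked doable in C# alone, so I wrote it as a change of pace. Personally I like projects that are self-contained in a single (meta)language. Having fewer files to distribute makes installation, uninstallation, and version management easier. Mixing languages in Visual Studio also causes trouble when you ask someone on an Express Edition to build the project. A simpler build system really is better. Those are the main reasons I decided to write this.
The code below hasn't been tidied up, so it is hard to read, but the basic flow is:

Allocate a region with VirtualAlloc and write the functions you want to use into it
Change the protection to PAGE_EXECUTE with VirtualProtect *1
Pass the start address of each function to Marshal.GetDelegateForFunctionPointer to convert it into a .NET delegate

Like ak11's article, the code below creates three functions, Prefetch128, Prefetch256, and cpuid, and calls them from C# code. The bodies of the functions were prepared in advance, but the x86 or x86-64 version is chosen according to the execution environment. Itanium and other CPUs are not supported.
As an aside, it is in principle possible to have the program control the generated functions themselves. I haven't gone that far here, but if you want to venture into that kind of dynamic code generation, Xbyak should be a useful reference.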
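The listing below packs the three routines into a single buffer, padding each one to an 8-byte boundary with int3 (0xCC) and recording where each one starts. The arithmetic can be sketched in C as follows; `align8` mirrors the name used in the listing, while `pad_copy` is a hypothetical helper introduced only for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round a length up to the next multiple of 8. */
size_t align8(size_t len)
{
    return (len + 7u) & ~(size_t)7u;
}

/* Copy `len` bytes of machine code into dst, then fill the remainder up to
 * the 8-byte boundary with `padding` (for example 0xCC, the int3 opcode).
 * dst must have room for align8(len) bytes. Returns the bytes written. */
size_t pad_copy(unsigned char *dst, const unsigned char *code, size_t len,
                unsigned char padding)
{
    assert(dst != NULL && code != NULL);
    size_t padded = align8(len);
    memcpy(dst, code, len);
    memset(dst + len, padding, padded - len);
    return padded;
}
```

With the x86 byte arrays in the listing, Prefetch128 is 14 bytes long and is padded to 16, so Prefetch256 starts at offset 16 in the combined buffer.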

// Windows applications may or may not be originally written in Objective-C,
// C, C++, or JavaScript as executed by the JScript engine, and not only code
// written in C, C++, and Objective-C but also code written in other languages
// can compile and directly link against the Documented APIs.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;
using System.Security;
using System.Security.Permissions;

namespace Win32
{
  internal enum ProcessorArchitecture : ushort
  {
    PROCESSOR_ARCHITECTURE_AMD64 = 9,
    PROCESSOR_ARCHITECTURE_IA64 = 6,
    PROCESSOR_ARCHITECTURE_INTEL = 0,
    PROCESSOR_ARCHITECTURE_UNKNOWN = 0xffff,
  }
  internal enum ProcessorType : uint
  {
    PROCESSOR_INTEL_386 = 386,
    PROCESSOR_INTEL_486 = 486,
    PROCESSOR_INTEL_PENTIUM = 586,
    PROCESSOR_INTEL_IA64 = 2200,
    PROCESSOR_AMD_X8664 = 8664,
  }
  [Flags]
  internal enum VirtualAllocType : uint
  {
    MEM_COMMIT = 0x1000,
    MEM_RESERVE = 0x2000,
    MEM_RESET = 0x80000,
    MEM_LARGE_PAGES = 0x20000000,
    MEM_PHYSICAL = 0x400000,
    MEM_TOP_DOWN = 0x100000,
    MEM_WRITE_WATCH = 0x200000,
  }
  [Flags]
  internal enum VirtualFreeType : uint
  {
    MEM_DECOMMIT = 0x4000,
    MEM_RELEASE = 0x8000,
  }
  [Flags]
  internal enum MemoryProtectionType : uint
  {
    PAGE_NOACCESS = 0x01,
    PAGE_READONLY = 0x02,
    PAGE_READWRITE = 0x04,
    PAGE_WRITECOPY = 0x08,
    PAGE_EXECUTE = 0x10,
    PAGE_EXECUTE_READ = 0x20,
    PAGE_EXECUTE_READWRITE = 0x40,
    PAGE_EXECUTE_WRITECOPY = 0x80,
    PAGE_GUARD = 0x100,
    PAGE_NOCACHE = 0x200,
    PAGE_WRITECOMBINE = 0x400
  }
  [StructLayout(LayoutKind.Sequential, Pack = 2)]
  internal struct SystemInfo
  {
    public ProcessorArchitecture ProcessorArchitecture;
    ushort Reserved;
    public uint PageSize;
    public IntPtr MinimumApplicationAddress;
    public IntPtr MaximumApplicationAddress;
    public IntPtr ActiveProcessorMask;
    public uint NumberOfProcessors;
    public ProcessorType ProcessorType;
    public uint AllocationGranularity;
    public ushort ProcessorLevel;
    public ushort ProcessorRevision;
  }
  [SecurityPermission(SecurityAction.LinkDemand, UnmanagedCode = true)]
  internal class VirtualAllocRegion : SafeHandle
  {
    private VirtualAllocRegion()
      : base(IntPtr.Zero, true)
    {
    }
    public override bool IsInvalid
    {
      get { return handle == IntPtr.Zero; }
    }
    protected override bool ReleaseHandle()
    {
      return NativeMethods.VirtualFree(handle, (UIntPtr)0, VirtualFreeType.MEM_RELEASE);
    }
  }
  internal static class NativeMethods
  {
    [SuppressUnmanagedCodeSecurityAttribute]
    [DllImport("Kernel32.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
    extern static public IntPtr GetCurrentProcess();
    [SuppressUnmanagedCodeSecurityAttribute]
    [DllImport("Kernel32.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    extern static public bool FlushInstructionCache(IntPtr processHandle, IntPtr address, uint regionSize);
    [SuppressUnmanagedCodeSecurityAttribute]
    [DllImport("Kernel32.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    extern static public void GetSystemInfo(out SystemInfo info);
    [SuppressUnmanagedCodeSecurityAttribute]
    [DllImport("Kernel32.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    extern static public VirtualAllocRegion VirtualAlloc(
        IntPtr address, UIntPtr size, VirtualAllocType allocType, MemoryProtectionType protectionType);
    [DllImport("Kernel32.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    extern static public bool VirtualFree(IntPtr address, UIntPtr size, VirtualFreeType allocType);
    [DllImport("Kernel32.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    extern static public bool VirtualProtect(
        IntPtr address, UIntPtr size, MemoryProtectionType protectionType,
        out MemoryProtectionType oldProtectionType);
  }

  [StructLayout(LayoutKind.Sequential, Pack = 4)]
  public struct CPUInfo
  {
    public uint eax;
    public uint ebx;
    public uint ecx;
    public uint edx;
  }

  public static unsafe class Prefetcher
  {
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    [SuppressUnmanagedCodeSecurityAttribute]
    private delegate void PrefetcherDelegate(void* address);
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    [SuppressUnmanagedCodeSecurityAttribute]
    private delegate void CPUIDDelegate([In, Out] ref CPUInfo info);

    private static void DummyPrefetcher(void* address) { }
    private static void DummyCPUID(ref CPUInfo info) { }
    private static PrefetcherDelegate prefetch128_ = DummyPrefetcher;
    private static PrefetcherDelegate prefetch256_ = DummyPrefetcher;
    private static CPUIDDelegate cpuid_ = DummyCPUID;

    private readonly static VirtualAllocRegion region_ = null;

    private static TDelegate CreateDelegate<TDelegate>(IntPtr base_address, int offset)
      where TDelegate : class
    {
      var address = IntPtr.Add(base_address, offset);
      return Marshal.GetDelegateForFunctionPointer(address, typeof(TDelegate)) as TDelegate;
    }

    static Prefetcher()
    {
      // Use GetSystemInfo API to determine the processor architecture.
      var system_info = default(SystemInfo);
      NativeMethods.GetSystemInfo(out system_info);
      var supported_architectures = new []
      {
        ProcessorArchitecture.PROCESSOR_ARCHITECTURE_INTEL,
        ProcessorArchitecture.PROCESSOR_ARCHITECTURE_AMD64,
      };
      if (!supported_architectures.Contains(system_info.ProcessorArchitecture))
      {
        // Unsupported architecture.
        return;
      }

      var x86 = new
      {
        CPUID = new byte[]
        {
          // void __declspec(noinline) __stdcall CPUID(CPUInfo* info);
          0x53,                    //  push        ebx
          0x57,                    //  push        edi
          0x8B, 0x7C, 0x24, 0x0C,  //  mov         edi,dword ptr [esp+0Ch]
          0x8B, 0x07,              //  mov         eax,dword ptr [edi]
          0x8B, 0x4F, 0x08,        //  mov         ecx,dword ptr [edi+8]
          0x0F, 0xA2,              //  cpuid
          0x89, 0x07,              //  mov         dword ptr [edi],eax
          0x89, 0x5F, 0x04,        //  mov         dword ptr [edi+4],ebx
          0x89, 0x4F, 0x08,        //  mov         dword ptr [edi+8],ecx
          0x89, 0x57, 0x0C,        //  mov         dword ptr [edi+0Ch],edx
          0x5F,                    //  pop         edi
          0x5B,                    //  pop         ebx
          0xC2, 0x04, 0x00,        //  ret         4
        },
        Prefetch128 = new byte[]
        {
          // void __declspec(noinline) __stdcall Prefetch128(void* ptr);
          0x8B, 0x4C, 0x24, 0x04,  // mov ecx, dword ptr [esp+4]
          0x0F, 0x18, 0x19,        // prefetcht2  [ecx]
          0x0F, 0x18, 0x59, 0x40,  // prefetcht2  [ecx+40h]
          0xC2, 0x04, 0x00,        // ret 4
        },
        Prefetch256 = new byte[]
        {
          // void __declspec(noinline) __stdcall Prefetch256(void* ptr);
          0x8B, 0x4C, 0x24, 0x04,                    // mov ecx, dword ptr [esp+4]
          0x0F, 0x18, 0x19,                          // prefetcht2  [ecx]
          0x0F, 0x18, 0x59, 0x40,                    // prefetcht2  [ecx+40h]
          0x0F, 0x18, 0x99, 0x80, 0x00, 0x00, 0x00,  // prefetcht2  [ecx+80h]
          0x0F, 0x18, 0x99, 0xC0, 0x00, 0x00, 0x00,  // prefetcht2  [ecx+0C0h]
          0xC2, 0x04, 0x00,                          // ret 4
        },
      };

      var x64 = new
      {
        CPUID = new byte[]
        {
          // void __declspec(noinline) CPUID(CPUInfo* info);
          0x4C, 0x8B, 0xCB,        // mov         r9,rbx
          0x4C, 0x8B, 0xC1,        // mov         r8,rcx
          0x41, 0x8B, 0x00,        // mov         eax,dword ptr [r8]
          0x41, 0x8B, 0x48, 0x08,  // mov         ecx,dword ptr [r8+8]
          0x0F, 0xA2,              // cpuid
          0x41, 0x89, 0x00,        // mov         dword ptr [r8],eax
          0x41, 0x89, 0x58, 0x04,  // mov         dword ptr [r8+4],ebx
          0x41, 0x89, 0x48, 0x08,  // mov         dword ptr [r8+8],ecx
          0x41, 0x89, 0x50, 0x0C,  // mov         dword ptr [r8+0Ch],edx
          0x4C, 0x89, 0xCB,        // mov         rbx,r9
          0xC3,                    // ret
        },
        Prefetch128 = new byte[]
        {
          // void __declspec(noinline) Prefetch128(void* ptr);
          0x0F, 0x18, 0x19,        // prefetcht2  [rcx]
          0x0F, 0x18, 0x59, 0x40,  // prefetcht2  [rcx+40h]
          0xC3,                    // ret
        },
        Prefetch256 = new byte[]
        {
          // void __declspec(noinline) Prefetch256(void* ptr);
          0x0F, 0x18, 0x19,                          // prefetcht2  [rcx]
          0x0F, 0x18, 0x59, 0x40,                    // prefetcht2  [rcx+40h]
          0x0F, 0x18, 0x99, 0x80, 0x00, 0x00, 0x00,  // prefetcht2  [rcx+80h]
          0x0F, 0x18, 0x99, 0xC0, 0x00, 0x00, 0x00,  // prefetcht2  [rcx+0C0h]
          0xC3,                                      // ret
        },
      };

      var target = Environment.Is64BitProcess ? x64 : x86;

      // Align 8-byte boundary with the specified padding data.
      var align8 = (Func<byte[], byte, byte[]>)(
        (array, paddingData) => array.Concat(Enumerable.Repeat(paddingData, int.MaxValue))
                                     .Take((array.Length + 7) & ~7).ToArray());

      const byte int3 = 0xcc;
      var cpuid = align8(target.CPUID, int3);
      var prefetch128 = align8(target.Prefetch128, int3);
      var prefetch256 = align8(target.Prefetch256, int3);

      var data = prefetch128.Concat(prefetch256)
                            .Concat(cpuid)
                            .ToArray();
      var offset = new
      {
        Prefetch128 = 0,
        Prefetch256 = prefetch128.Length,
        CPUID = prefetch128.Length + prefetch256.Length,
      };

      try
      {
        region_ = NativeMethods.VirtualAlloc(
            IntPtr.Zero, (UIntPtr)data.Length, VirtualAllocType.MEM_COMMIT,
            MemoryProtectionType.PAGE_READWRITE);
        if (region_.IsInvalid)
        {
          return;
        }

        var addr = region_.DangerousGetHandle();
        Marshal.Copy(data, 0, addr, data.Length);
        var oldType = default(MemoryProtectionType);
        var succeeded = NativeMethods.VirtualProtect(
            addr, (UIntPtr)data.Length, MemoryProtectionType.PAGE_EXECUTE, out oldType);
        if (!succeeded)
        {
          GlobalDispose();
          return;
        }

        // GetCurrentProcess returns a pseudo handle.
        // A pseudo handle does not need to be closed with CloseHandle.
        var pseudoHandle = NativeMethods.GetCurrentProcess();
        succeeded = NativeMethods.FlushInstructionCache(pseudoHandle, addr, (uint)data.Length);
        if (!succeeded)
        {
          GlobalDispose();
          return;
        }

        prefetch128_ = CreateDelegate<PrefetcherDelegate>(addr, offset.Prefetch128);
        prefetch256_ = CreateDelegate<PrefetcherDelegate>(addr, offset.Prefetch256);
        cpuid_ = CreateDelegate<CPUIDDelegate>(addr, offset.CPUID);
      }
      catch
      {
        GlobalDispose();
        throw;
      }
    }
    public static CPUInfo CPUID(uint type)
    {
      return CPUID(type, 0);
    }
    public static CPUInfo CPUID(uint type, uint sub_type)
    {
      var info = default(CPUInfo);
      info.eax = type;
      info.ecx = sub_type;
      cpuid_(ref info);
      return info;
    }
    public static void Prefetch128(void* address)
    {
      prefetch128_(address);
    }
    public static void Prefetch256(void* address)
    {
      prefetch256_(address);
    }
    // This method is not thread-safe.
    public static void GlobalDispose()
    {
      cpuid_ = DummyCPUID;
      prefetch128_ = DummyPrefetcher;
      prefetch256_ = DummyPrefetcher;
      if (region_ != null && !region_.IsInvalid) { region_.Dispose(); }
    }
  }
}

static class Program
{
  static unsafe void Main(string[] args)
  {
    var info = Win32.Prefetcher.CPUID(0);
    if (info.eax < 1)
    {
      return;
    }

    info = Win32.Prefetcher.CPUID(1);
    var HasMMX = (info.edx & (1 << 23)) != 0;
    var HasSSE = (info.edx & (1 << 25)) != 0;
    var HasSSE2 = (info.edx & (1 << 26)) != 0;
    var HasSSE3 = (info.ecx & (1 << 0)) != 0;

    var buffer = new byte[1024];
    fixed (byte* ptr = buffer)
    {
      Win32.Prefetcher.Prefetch128(ptr);
      Win32.Prefetcher.Prefetch256(ptr);
    }
    Win32.Prefetcher.GlobalDispose();
  }
}
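The feature checks in Main test individual bits of the EDX and ECX values returned by CPUID with EAX = 1. A small C sketch of the same decoding follows; the struct and function names are hypothetical, and the bit positions are the ones used in the listing above:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the flags checked in Main. */
struct cpu_features {
    int mmx;   /* EDX bit 23 */
    int sse;   /* EDX bit 25 */
    int sse2;  /* EDX bit 26 */
    int sse3;  /* ECX bit 0  */
};

/* Returns 1 if the given bit (0..31) is set in value, 0 otherwise. */
int test_bit(uint32_t value, unsigned bit)
{
    assert(bit < 32u);
    return (int)((value >> bit) & 1u);
}

/* Decode the ECX/EDX register values returned by CPUID leaf 1. */
struct cpu_features decode_leaf1(uint32_t ecx, uint32_t edx)
{
    struct cpu_features f;
    f.mmx  = test_bit(edx, 23);
    f.sse  = test_bit(edx, 25);
    f.sse2 = test_bit(edx, 26);
    f.sse3 = test_bit(ecx, 0);
    return f;
}
```

Keeping the decoding separate from the CPUID call itself means it can be exercised with fixed register values on any machine.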

*1: HeapCreate (http://msdn.microsoft.com/en-us/library/aa366599.aspx) + HEAP_CREATE_ENABLE_EXECUTE would also have worked, but I wanted to drop the writable attribute in the end, so this time I used VirtualAlloc + VirtualProtect.

2010-03-29

Two Ways to Integrate the Math Input Panel with an Application

Windows 7 .NET PowerShell

Windows 7 strengthens the Tablet PC features and adds support for handwritten input of mathematical expressions. Here are two ways to integrate this feature with an application.

Receiving data from the Math Input Panel via the clipboard

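The Math Input Panel is a math input tool that runs as a standalone application. It behaves like a software keyboard or a character palette and sends the expression to the application that has input focus.
In practice, this is implemented using the clipboard.
When the Insert button is pressed, the Math Input Panel places the expression on the clipboard as UTF-8 encoded MathML and generates a Ctrl+V keyboard event. To work with the Math Input Panel, the application that has input focus must therefore paste on Ctrl+V and be able to interpret clipboard data in the "MathML Presentation" or "MathML" format.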

Reference

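The Math Input Panel (MIP) is designed to be used with the tablet and pen of a Tablet PC. However, it can also be used with any input device, such as a touchscreen, an external digitizer, or a mouse. The MIP outputs the recognition result via the clipboard in MathML, a standardized mathematical markup language. An expression handwritten and recognized in the MIP is output to the destination application in a fully editable form. You can insert into and edit the output just as you would edit text.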

Note

The Math Input Panel can insert expressions only into programs that support Mathematical Markup Language (MathML).

Using the Math Input Panel's input window in-process

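The Math Input Panel's input window can also be used as a COM in-process server. The necessary information can be obtained from the type library stored in "%CommonProgramFiles%\Microsoft Shared\ink\micaut.dll".
For reference, here is a sample that does this through COM interop in C#.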

MathInputConsole.zip
- Windows 7 or later only

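When run, this sample shows the Math Input Panel's input window and a console window. When you press "Insert" in the input window, the expression displayed at that moment is written to the console as MathML.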

Reference

Programming the Math Input Control - MSDN Library

Aside: recent Microsoft products, MathML, and the clipboard

Recent Microsoft products make increasing use of MathML.
Microsoft Word 2007 and later, and Microsoft PowerPoint 2010, can paste expressions in MathML form. The clipboard data format needs some care, however. Trying this with the Microsoft Office 2010 Beta I have at hand, I found the following differences in behavior.

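Microsoft Word 2010
- Even when the data format is "Text" or "Unicode Text", the content is inserted as an expression if it is MathML.
- Pasting works with both the "MathML Presentation" and "MathML" data formats.
Microsoft PowerPoint 2010
- When the data format is "Text" or "Unicode Text", the content is treated as plain text even if it is MathML.
- When the data format is "MathML", the content is interpreted as MathML. "MathML Presentation" is not supported.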

The PowerShell script used for the experiment is shown below.

# Start PowerShell in STA mode (required by Windows.Forms.Clipboard)
powershell.exe -sta
$null = [Reflection.Assembly]::LoadWithPartialName("System.Windows.Forms")

# E = mc^2
$mathml = '<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:mi>m</mml:mi><mml:msup><mml:mi>c</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:math>'

# Store the MathML in a MemoryStream as UTF-8
$ms = New-Object System.IO.MemoryStream(,[System.Text.Encoding]::UTF8.GetBytes($mathml))

# Put it on the clipboard under the format name "MathML"
[Windows.Forms.Clipboard]::SetData("MathML", $ms)

After running this script, try pasting into Microsoft Word 2010 or Microsoft PowerPoint 2010. The expression should be pasted.
This makes it possible to take an expression from software that can output MathML, Mathematica for example, and paste it into PowerPoint with its structure intact.
Microsoft Word 2010 and Microsoft PowerPoint 2010 can also be configured to copy expressions to the clipboard as MathML. With that setting enabled, you could copy an expression written in PowerPoint into Mathematica and evaluate it on the spot.


2010-03-08

MSBuild, Environment Variables, Property Functions, and Build-Time Computation

.NET

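Can't get environment variables when building from Visual Studio...

The MSBuild Extension Pack has an EnvironmentVariable task, but it's unbelievable that this isn't included as standard. It could just as well be built into property expressions.
$(env:DXSDK_DIR)
Something like that.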

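Hmm, on my machine, with both Visual Studio 2008 and Visual Studio 2010, $(DXSDK_DIR) does seem to pick up the environment variable...
I also tried $(PROCESSOR_REVISION), $(OS), and various others, and they too were retrieved without any problem.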

So I'm not quite sure what the actual problem is, but with MSBuild 4.0 and later you can also write this explicitly using Property Functions.

$([System.Environment]::GetEnvironmentVariable("DXSDK_DIR"))

In this way, a number of predefined .NET class library types become available. You can, for example, call System.Math to compute trigonometric functions at build time.


2010-01-30

Encrypting Data Saved to Local Storage: the Windows Case

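On the Gumblar attack against FFFTP

On the Gumblar attack against FFFTP

The Gumblar virus, which steals FTP accounts and defaces web sites, is running rampant.

It has been reported that a variant of Gumblar is targeting FFFTP. See the following sites for details.

smilebanana
UnderForge of Lack

FFFTP stores passwords in the registry. They are lightly encrypted, but because FFFTP is open source, the way to undo the encryption can be worked out by analyzing the program source.

The Gumblar variant appears to read the passwords stored in the registry and use them to deface sites.

For these reasons, please take one of the following measures.

●If the FTP server you connect to supports SSL or similar.

→We recommend switching to FTP software that supports SSL. FFFTP currently does not support SSL or similar. When switching, uninstall FFFTP using "Add or Remove Programs" in Control Panel.

●If the FTP server you connect to does not support SSL or similar.

→Stop having FFFTP remember passwords, and enter the password every time you connect. However, Gumblar is also thought to intercept communications, so the password may be stolen by Gumblar at the moment you connect to the FTP server; this is not a complete safeguard.

As for the registry deletion described on UnderForge of Lack, the entries are normally deleted when FFFTP is uninstalled through Control Panel.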

Having read that, I took a look at the FFFTP source.

/*----- Encrypt a password ----------------------------------------------------
*
*	Parameter
*		char *Str : password
*		char *Buf : buffer that receives the encrypted password
*
*	Return Value
*		none
*----------------------------------------------------------------------------*/

static void EncodePassword(char *Str, char *Buf)
{
	unsigned char *Get;
	unsigned char *Put;
	int Rnd;
	int Ch;

	srand((unsigned)time(NULL));

	Get = (unsigned char *)Str;
	Put = (unsigned char *)Buf;
	while(*Get != NUL)
	{
		Rnd = rand() % 3;
		Ch = ((int)*Get++) << Rnd;
		Ch = (unsigned char)Ch | (unsigned char)(Ch >> 8);
		*Put++ = 0x40 | ((Rnd & 0x3) << 4) | (Ch & 0xF);
		*Put++ = 0x40 | ((Ch >> 4) & 0xF);
		if((*(Put-2) & 0x1) != 0)
			*Put++ = (rand() % 62) + 0x40;
	}
	*Put = NUL;
	return;
}


/*----- Decrypt a password ----------------------------------------------------
*
*	Parameter
*		char *Str : encrypted password
*		char *Buf : buffer that receives the password
*
*	Return Value
*		none
*----------------------------------------------------------------------------*/

static void DecodePassword(char *Str, char *Buf)
{
	unsigned char *Get;
	unsigned char *Put;
	int Rnd;
	int Ch;

	Get = (unsigned char *)Str;
	Put = (unsigned char *)Buf;
	while(*Get != NUL)
	{
		Rnd = ((unsigned int)*Get >> 4) & 0x3;
		Ch = (*Get & 0xF) | ((*(Get+1) & 0xF) << 4);
		Ch <<= 8;
		if((*Get & 0x1) != 0)
			Get++;
		Get += 2;
		Ch >>= Rnd;
		Ch = (Ch & 0xFF) | ((Ch >> 8) & 0xFF);
		*Put++ = Ch;
	}
	*Put = NUL;
	return;
}

The connection passwords at issue here appear to have been encoded with the EncodePassword function above and then written to the registry.
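To make the point concrete, here is a minimal portable sketch of the same scheme as the quoted EncodePassword / DecodePassword pair, rewritten around std::string so it can be compiled anywhere. The function names EncodeFfftpStyle, DecodeFfftpStyle, and CheckRoundTrip are my own (hypothetical, not from the FFFTP source). It is only meant to show that decoding needs no key at all: the rotation count travels inside the encoded bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Sketch of the quoted encoding: each input byte is rotated left by 0..2 bits
// and split into two bytes in the 0x40..0x6F range. The rotation count is
// stored in bits 4-5 of the first byte. If the first byte is odd, one random
// padding byte follows.
std::string EncodeFfftpStyle(const std::string& plain) {
  std::string out;
  for (unsigned char c : plain) {
    int rnd = std::rand() % 3;
    int ch = static_cast<int>(c) << rnd;
    ch = (ch & 0xFF) | ((ch >> 8) & 0xFF);  // rotate left within 8 bits
    unsigned char first =
        static_cast<unsigned char>(0x40 | ((rnd & 0x3) << 4) | (ch & 0xF));
    out.push_back(static_cast<char>(first));
    out.push_back(static_cast<char>(0x40 | ((ch >> 4) & 0xF)));
    if ((first & 0x1) != 0)
      out.push_back(static_cast<char>((std::rand() % 62) + 0x40));
  }
  return out;
}

// Decoding reads the rotation count back out of the data itself,
// so nothing secret is required.
std::string DecodeFfftpStyle(const std::string& encoded) {
  std::string out;
  std::size_t i = 0;
  while (i + 1 < encoded.size()) {
    unsigned char first = static_cast<unsigned char>(encoded[i]);
    unsigned char second = static_cast<unsigned char>(encoded[i + 1]);
    int rnd = (first >> 4) & 0x3;
    int ch = (first & 0xF) | ((second & 0xF) << 4);
    ch <<= 8;
    ch >>= rnd;
    ch = (ch & 0xFF) | ((ch >> 8) & 0xFF);  // rotate right within 8 bits
    out.push_back(static_cast<char>(ch));
    i += ((first & 0x1) != 0) ? 3 : 2;      // skip the padding byte if present
  }
  return out;
}

// Helper for experiments: aborts if the round trip does not restore the input.
void CheckRoundTrip(const std::string& s) {
  assert(DecodeFfftpStyle(EncodeFfftpStyle(s)) == s);
}
```

Calling CheckRoundTrip("kogaidan") exercises both directions. The encoded text differs from run to run depending on rand(), but it always decodes to the same input, which is why this is better described as obfuscation than as encryption.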

CryptProtectData API

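First, a caveat.

I am not a security expert, so before trusting anything below, I recommend consulting security experts inside or outside your organization.

Now, what nags at me a little about this incident is the claim that "with an open-source product, you can learn how to decode the data by reading the source."
As a starting point, let's look at what Microsoft recommends.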

Storing Passwords

Never store passwords in plaintext (unencrypted). Encrypting passwords significantly increases their security. For information about storing encrypted passwords, see CryptProtectData. For information about encrypting passwords in memory, see CryptProtectMemory. Store passwords in as few places as possible. The more places a password is stored, the greater the chance that an intruder might find it. Never store passwords in a Web page or in a Web-based file. Storing passwords in a Web page or in a Web-based file allows them to be easily compromised.

After you have encrypted a password and stored it, use secure ACLs to limit access to the file. Alternatively, you can store passwords and encryption keys on removable devices. Storing passwords and encryption keys on a removable media, such as a smart card, helps create a more secure system. After a password is retrieved for a given session, the card can be removed, thereby removing the possibility that an intruder can gain access to it.

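As the passage above says, you should encrypt data with the CryptProtectData API or the CryptProtectMemory API *1, and also pay attention to the access control list of wherever you store it.
In fact, Chromium (Google Chrome) is an open-source product that encrypts passwords and similar data with CryptProtectData.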

// Copyright (c) 2006-2008 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#include "chrome/browser/password_manager/encryptor.h"

#include <windows.h>
#include <wincrypt.h>
#include "base/string_util.h"

#pragma comment(lib, "crypt32.lib")

bool Encryptor::EncryptWideString(const std::wstring& plaintext,
                                  std::string* ciphertext) {
  return EncryptString(WideToUTF8(plaintext), ciphertext);
}

bool Encryptor::DecryptWideString(const std::string& ciphertext,
                                  std::wstring* plaintext){
  std::string utf8;
  if (!DecryptString(ciphertext, &utf8))
    return false;

  *plaintext = UTF8ToWide(utf8);
  return true;
}

bool Encryptor::EncryptString(const std::string& plaintext,
                              std::string* ciphertext) {
  DATA_BLOB input;
  input.pbData = const_cast<BYTE*>(
    reinterpret_cast<const BYTE*>(plaintext.data()));
  input.cbData = static_cast<DWORD>(plaintext.length());

  DATA_BLOB output;
  BOOL result = CryptProtectData(&input, L"", NULL, NULL, NULL,
                                 0, &output);
  if (!result)
    return false;

  // this does a copy
  ciphertext->assign(reinterpret_cast<std::string::value_type*>(output.pbData),
                     output.cbData);

  LocalFree(output.pbData);
  return true;
}

bool Encryptor::DecryptString(const std::string& ciphertext,
                              std::string* plaintext){
  DATA_BLOB input;
  input.pbData = const_cast<BYTE*>(
    reinterpret_cast<const BYTE*>(ciphertext.data()));
  input.cbData = static_cast<DWORD>(ciphertext.length());

  DATA_BLOB output;
  BOOL result = CryptUnprotectData(&input, NULL, NULL, NULL, NULL,
                                   0, &output);
  if(!result)
    return false;

  plaintext->assign(reinterpret_cast<char*>(output.pbData), output.cbData);
  LocalFree(output.pbData);
  return true;
}

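Chromium Revision 8066 passes NULL for the pOptionalEntropy and pPromptStruct arguments of the CryptProtectData API. This means that any code running as the same user on the same computer can decrypt the data with the CryptUnprotectData API.
Decryption by a different user of the same computer is prevented (to the extent you can expect from the encryption settings of the PC on which the data was encrypted). If the encrypted byte sequence leaks, decryption is likewise prevented (to that same extent).
As an example, I encrypted the following byte sequence the same way Chromium does.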

const unsigned char original_password[] = {
  0x6b, 0x6f, 0x67, 0x61, 0x69, 0x64, 0x61, 0x6e,
};

On a Windows XP SP3 virtual machine I set up, the byte sequence above produced the following output. Note that the same input yields a different result every time, but every output decrypts to the same data.

const unsigned char encrypted_password[] = {
  0x01, 0x00, 0x00, 0x00, 0xd0, 0x8c, 0x9d, 0xdf,
  0x01, 0x15, 0xd1, 0x11, 0x8c, 0x7a, 0x00, 0xc0,
  0x4f, 0xc2, 0x97, 0xeb, 0x01, 0x00, 0x00, 0x00,
  0x22, 0xd6, 0xf5, 0x5e, 0x47, 0x15, 0xa1, 0x4d,
  0x97, 0xde, 0x34, 0xbf, 0xc8, 0xb9, 0x4c, 0x9c,
  0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x03, 0x66, 0x00, 0x00, 0xa8, 0x00,
  0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0xf6, 0xbb,
  0xf7, 0x64, 0xd3, 0xe0, 0x27, 0x58, 0xcf, 0xd0,
  0xf1, 0xab, 0x21, 0x3f, 0x6b, 0xf8, 0x00, 0x00,
  0x00, 0x00, 0x04, 0x80, 0x00, 0x00, 0xa0, 0x00,
  0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x98, 0x84,
  0x42, 0xa8, 0x82, 0xff, 0x44, 0xc2, 0x44, 0xeb,
  0xaa, 0xc8, 0x84, 0xd2, 0x0d, 0x18, 0x10, 0x00,
  0x00, 0x00, 0x56, 0x63, 0x8c, 0x93, 0x17, 0x8e,
  0xe0, 0x7d, 0x38, 0x77, 0x6f, 0xe1, 0xda, 0x64,
  0x85, 0xdc, 0x14, 0x00, 0x00, 0x00, 0x31, 0xd4,
  0xab, 0x9c, 0xeb, 0xc3, 0x17, 0x62, 0xa6, 0xcd,
  0xcc, 0x1c, 0x0e, 0x35, 0xfa, 0x13, 0x52, 0x0e,
  0x00, 0x3a, };

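Now, since any process of the same user can easily decrypt this data, the encryption means little once any one application has been compromised*2. That said, defending against attacks from same-user processes on Windows is hard. This is really the operating system's territory, and because the OS has such a long history, any improvement has a large compatibility impact and brings the kind of major confusion and long-term effort we saw with the introduction of Integrity Level and UAC.
For those with Windows development experience, here is the three-line summary.

Suppose an interactive process running at Integrity Level Low (a browser or one of its plug-ins) has a vulnerability that allows arbitrary code execution,
and suppose that process can call the APIs for reading data from the registry and files, as well as the CryptUnprotectData API.
How, then, do you keep personal information (or confidential data stored like it) from being stolen from the registry and files?

Meeting this kind of requirement feels (on today's Windows) less like the job of an application or the OS and more like the job of antivirus software.
Reality is rough again today…*3

Sample code used for the experiment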

#include <iostream>

#include <windows.h>
#include <wincrypt.h>
#include <string>

#pragma comment(lib, "crypt32.lib")

bool EncryptString(const std::string& plaintext,
                   std::string* ciphertext) {
  DATA_BLOB input;
  input.pbData = const_cast<BYTE*>(
    reinterpret_cast<const BYTE*>(plaintext.data()));
  input.cbData = static_cast<DWORD>(plaintext.length());

  DATA_BLOB output;
  BOOL result = CryptProtectData(
    &input, L"", NULL, NULL, NULL,
    CRYPTPROTECT_UI_FORBIDDEN, // remove if you want to use password
    &output);
  if (!result)
    return false;

  // this does a copy
  ciphertext->assign(reinterpret_cast<std::string::value_type*>(output.pbData),
                     output.cbData);
  LocalFree(output.pbData);
  return true;
}

bool DecryptString(const std::string& ciphertext,
                   std::string* plaintext){
  DATA_BLOB input;
  input.pbData = const_cast<BYTE*>(
    reinterpret_cast<const BYTE*>(ciphertext.data()));
  input.cbData = static_cast<DWORD>(ciphertext.length());

  DATA_BLOB output;
  BOOL result = CryptUnprotectData(
    &input, NULL, NULL, NULL, NULL,
    CRYPTPROTECT_UI_FORBIDDEN, // remove if you want to use password
    &output);
  if(!result)
    return false;

  plaintext->assign(reinterpret_cast<char*>(output.pbData), output.cbData);
  LocalFree(output.pbData);
  return true;
}

void dump_string(const std::string& label, const std::string& test);

int main() {
  std::string original_text = "kogaidan";
  dump_string("original_password", original_text);

  // Encrypt with DPAPI and dump the resulting blob as a C array.
  std::string encrypted_text;
  if (!EncryptString(original_text, &encrypted_text)) {
    std::cerr << "EncryptString failed" << std::endl;
    return 1;
  }

  dump_string("encrypted_password", encrypted_text);

  // Decrypt again to confirm the round trip.
  std::string decrypted_text;
  if (!DecryptString(encrypted_text, &decrypted_text)) {
    std::cerr << "DecryptString failed" << std::endl;
    return 1;
  }
  dump_string("decrypted_password", decrypted_text);

  return 0;
}

#pragma region dump_string 
void dump_string(const std::string& label, const std::string& test) {
  std::cout << "const unsigned char " << label.c_str() << "[] = {\n";
  std::cout << std::hex;
  int count = 0;
  for (std::string::const_iterator i = test.begin(); i != test.end(); ++i) {
    if (count++ == 0) { std::cout << "  "; }
    std::cout << "0x";
    std::cout.width(2);
    std::cout.fill('0');
    std::cout << (*i & 0xff) << ", ";
    if (count >= 8) { std::cout << "\n"; count = 0; }
  }
  std::cout << std::dec;
  std::cout << "};\n";
  std::cout << std::endl;
}
#pragma endregion

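References

DPAPI / DPAPIによる暗号化 - EternalWindows
- Explains how to use CryptProtectData, including how to show a prompt and combine it with a user-entered password.
http://www.forest.impress.co.jp/docs/news/20100130_346056.html
- Various coverage of the attacks targeting the data that FFFTP stores

References 2

The CryptProtectMemory API is available on Windows Vista and later*4, and provides time-limited encryption, for example data that can be decrypted only during the current logon session.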

pData specifies the data to encrypt. cbData specifies the size of pData; this value must be a multiple of the CRYPTPROTECTMEMORY_BLOCK_SIZE constant. dwFlags takes one of the constants listed below.

Constant	Description
CRYPTPROTECTMEMORY_SAME_PROCESS	Only the process that encrypted the data can decrypt it. Once the process exits, the data can no longer be decrypted.
CRYPTPROTECTMEMORY_CROSS_PROCESS	Other processes, not only the one that encrypted the data, can decrypt it. Once the system shuts down, the data can no longer be decrypted.
CRYPTPROTECTMEMORY_SAME_LOGON	Other processes, not only the one that encrypted the data, can decrypt it, provided they run in the same logon session as the encrypting process. Once the system shuts down, the data can no longer be decrypted.
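The size requirement in the quoted description means a secret usually has to be copied into a padded buffer before it can be handed to CryptProtectMemory. The sketch below shows only that padding step in portable C++; it does not call the Windows API. The constant kProtectMemoryBlockSize is my stand-in for CRYPTPROTECTMEMORY_BLOCK_SIZE (assumed here to be 16, as defined in wincrypt.h), and the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: mirrors CRYPTPROTECTMEMORY_BLOCK_SIZE from <wincrypt.h>.
const std::size_t kProtectMemoryBlockSize = 16;

// Rounds n up to the next multiple of the block size (0 stays 0).
std::size_t RoundUpToBlockSize(std::size_t n) {
  const std::size_t b = kProtectMemoryBlockSize;
  return ((n + b - 1) / b) * b;
}

// Copies the secret into a zero-filled buffer whose size is a multiple of
// the block size, ready to be passed as the pData / cbData pair.
std::vector<unsigned char> MakePaddedBuffer(const std::string& secret) {
  std::vector<unsigned char> buffer(RoundUpToBlockSize(secret.size()), 0);
  std::copy(secret.begin(), secret.end(), buffer.begin());
  assert(buffer.size() % kProtectMemoryBlockSize == 0);
  return buffer;
}
```

On Windows, the resulting buffer.data() and buffer.size() would be the values passed to the API. Because the padding is zero bytes, the caller has to remember the original length separately if the secret itself may contain zero bytes.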

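References 3

CredUIPromptForCredentials (on Vista and later, CredUIPromptForWindowsCredentials is recommended) also looks usable for password management, but because it stores secrets per user, it too looks to be in trouble as soon as any one of that user's processes is compromised.

Introduction

Applications sometimes need user-supplied credentials to access protected resources such as databases or FTP sites. However, obtaining and storing user IDs and passwords also poses a security risk to the system. Where possible, you should avoid having users supply credentials at all (for example, by using integrated authentication for databases), but sometimes this is unavoidable. When you do need to request credentials from the user and the application runs on Microsoft® Windows® XP or Microsoft® Windows Server 2003, the operating system provides functions that make this task easier.

Stored User Names and Passwords

Windows XP and Windows Server 2003 use a feature called "Stored User Names and Passwords" (see Figure 1) to associate a set of credentials with a single Windows user account, and store those credentials using the Data Protection API (DPAPI).

Figure 1. The [Credential Management] dialog box in Windows XP

When an application runs on Windows XP or Windows .NET, it can use the credential management API functions to prompt the user for credentials. Using these APIs provides a consistent user interface (see Figure 2) and automatically supports caching of those credentials by the operating system.

Figure 2. The standard Windows XP credentials dialog box

The issues involved in requesting, storing, and using user credentials in applications are covered in more detail in "Writing Secure Code" by Michael Howard and David LeBlanc. We recommend reading that book for details. This article shows how to use the credential management API from Microsoft® Visual Basic® .NET and C# applications.

Once again, with the EternalWindows article as a reference.

Credentials Management - EternalWindows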

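*1: CryptProtectData is available only on Windows 2000 and later; CryptProtectMemory requires a newer version of Windows (Windows Vista or later, as noted further down).

*2: Even if you also use the pOptionalEntropy argument, the problem then becomes how to hide the contents of pOptionalEntropy. Writing the contents of pOptionalEntropy into the source code is the same as writing the password into the source code.

*3: Source of the joke: http://niha28.sakura.ne.jp/b/log/100

*4: For Windows 2000 SP3 and later, see RtlEncryptMemory (http://msdn.microsoft.com/en-us/library/aa387693.aspx)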

2010-01-22

Book introduction: CLR via C#, Third Edition

.NET Book

CLR via C#, Third Edition

Author: Jeffrey Richter
Publisher: Microsoft Press
Release date: 2010/02/10
Format: Paperback

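The season of the third edition has arrived.
Some of you may be thinking "never heard of that book," but if I say it is the original of 『プログラミングMicrosoft .NET Framework』, I think most .NET developers will recognize it*1.
One of the highlights of this update is apparently coverage of .NET Framework 4 / C# 4.0, but frankly, for anyone who owns the second edition it is the usual periodic refresh. I plan to look into the finer changes once my copy arrives.
The author has announced what changed on his blog. In the comment thread on that post, he also writes that "an eBook version is planned"*2. If you are cautious, or are getting by with the Japanese translation for now, and are unsure whether to buy on release day, that should help you decide.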

Part I – CLR Basics

Chapter 1-The CLR’s Execution Model

Added discussion about C#’s /optimize and /debug switches and how they relate to each other.

Chapter 2-Building, Packaging, Deploying, and Administering Applications and Types

Improved discussion about Win32 manifest information and version resource information.

Chapter 3-Shared Assemblies and Strongly Named Assemblies

Added discussion of TypeForwardedToAttribute and TypeForwardedFromAttribute.

Part II – Designing Types

Chapter 4-Type Fundamentals

No new topics.

Chapter 5-Primitive, Reference, and Value Types

Enhanced discussion of checked and unchecked code and added discussion of new BigInteger type. Also added discussion of C# 4.0’s dynamic primitive type.

Chapter 6-Type and Member Basics

No new topics.

Chapter 7-Constants and Fields

No new topics.

Chapter 8-Methods

Added discussion of extension methods and partial methods.

Chapter 9-Parameters

Added discussion of optional/named parameters and implicitly-typed local variables.

Chapter 10-Properties

Added discussion of automatically-implemented properties, properties and the Visual Studio debugger, object and collection initializers, anonymous types, the System.Tuple type and the ExpandoObject type.

Chapter 11-Events

Added discussion of events and thread-safety as well as showing a cool extension method to simplify the raising of an event.

Chapter 12-Generics

Added discussion of delegate and interface generic type argument variance.

Chapter 13-Interfaces

No new topics.

Part III – Essential Types

Chapter 14-Chars, Strings, and Working with Text

No new topics.

Chapter 15-Enums

Added coverage of new Enum and Type methods to access enumerated type instances.

Chapter 16-Arrays

Added new section on initializing array elements.

Chapter 17-Delegates

Added discussion of using generic delegates to avoid defining new delegate types. Also added discussion of lambda expressions.

Chapter 18-Attributes

No new topics.

Chapter 19-Nullable Value Types

Added discussion on performance.

Part IV – CLR Facilities

Chapter 20-Exception Handling and State Management

This chapter has been completely rewritten. It is now about exception handling and state management. It includes discussions of code contracts and constrained execution regions (CERs). It also includes a new section on trade-offs between writing productive code and reliable code.

Chapter 21-Automatic Memory Management

Added discussion of C#’s fixed statement and how it works to pin objects in the heap. Rewrote the code for weak delegates so you can use them with any class that exposes an event (the class doesn’t have to support weak delegates itself). Added discussion on the new ConditionalWeakTable class, GC Collection modes, Full GC notifications, garbage collection modes and latency modes. I also include a new sample showing how your application can receive notifications whenever Generation 0 or 2 collections occur.

Chapter 22-CLR Hosting and AppDomains

Added discussion of side-by-side support allowing multiple CLRs to be loaded in a single process. Added section on the performance of using MarshalByRefObject-derived types. Substantially rewrote the section on cross-AppDomain communication. Added section on AppDomain Monitoring and first chance exception notifications. Updated the section on the AppDomainManager class.

Chapter 23-Assembly Loading and Reflection

Added section on how to deploy a single file with dependent assemblies embedded inside it. Added section comparing reflection invoke vs bind/invoke vs bind/create delegate/invoke vs C#’s dynamic type.

Chapter 24-Runtime Serialization

This is a whole new chapter that was not in the 2nd Edition.

Part V – Threading

Chapter 25-Threading Basics

Whole new chapter motivating why Windows supports threads, thread overhead, CPU trends, NUMA Architectures, the relationship between CLR threads and Windows threads, the Thread class, reasons to use threads, thread scheduling and priorities, foreground thread vs background threads.

Chapter 26-Performing Compute-Bound Asynchronous Operations

Whole new chapter explaining the CLR’s thread pool. This chapter covers all the new .NET 4.0 constructs including cooperative cancelation, Tasks, the Parallel class, parallel language integrated query, timers, how the thread pool manages its threads, cache lines and false sharing.

Chapter 27-Performing I/O-Bound Asynchronous Operations

Whole new chapter explaining how Windows performs synchronous and asynchronous I/O operations. Then, I go into the CLR’s Asynchronous Programming Model, my AsyncEnumerator class, the APM and exceptions, Applications and their threading models, implementing a service asynchronously, the APM and Compute-bound operations, APM considerations, I/O request priorities, converting the APM to a Task, the event-based Asynchronous Pattern, programming model soup.

Chapter 28-Primitive Thread Synchronization Constructs

Whole new chapter discusses class libraries and thread safety, primitive user-mode, kernel-mode constructs, and data alignment.

Chapter 29-Hybrid Thread Synchronization Constructs

Whole new chapter discussing various hybrid constructs such as ManualResetEventSlim, SemaphoreSlim, CountdownEvent, Barrier, ReaderWriterLock(Slim), OneManyResourceLock, Monitor, 3 ways to solve the double-check locking technique, .NET 4.0’s Lazy and LazyInitializer classes, the condition variable pattern, .NET 4.0’s concurrent collection classes, the ReaderWriterGate and SyncGate classes.

The plan is that this book WILL have an eBook available for it. I do not know more of the details just now.

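Reference links (from the past, worth following for those who hope the third edition gets translated as the second edition was)

*1: The author, Jeffrey Richter, is also familiar to developers from the Win32 era as the author of the original of 『Advanced Windows 第5版上』 and 『Advanced Windows 第5版下』.

*2: Speaking of Jeffrey Richter, given past episodes such as 『なんで Advanced Windows の(原書の)電子ブック版は無くなったの？ - NyaRuRuの日記』 ("Why did the eBook version of (the original of) Advanced Windows disappear?"), it is good news that an eBook version of 『CLR via C#, Third Edition』 is planned.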

Location	"C:\Program Files\Microsoft Office Servers\12.0\Bin" (where the Japanese word breaker nlsdata0011.dll resides)
File name	"Custom0011.lex" (0011 is the language ID)
Encoding	"Unicode"