﻿<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>学习日记 &#187; 屏幕取词</title>
	<atom:link href="https://www.softwareace.cn/?cat=64&#038;feed=rss2" rel="self" type="application/rss+xml" />
	<link>https://www.softwareace.cn</link>
	<description>Always thinking about what more can be done for the good of your own product</description>
	<lastBuildDate>Fri, 20 Mar 2026 06:58:28 +0000</lastBuildDate>
	<language>zh-CN</language>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
	
	<item>
		<title>tesseract_ocr: character recognition basics, training a font library, and merging font libraries</title>
		<link>https://www.softwareace.cn/?p=1512</link>
		<comments>https://www.softwareace.cn/?p=1512#comments</comments>
		<pubDate>Sun, 11 Sep 2016 09:34:15 +0000</pubDate>
		<dc:creator><![CDATA[admin]]></dc:creator>
				<category><![CDATA[屏幕取词]]></category>

		<guid isPermaLink="false">http://www.softwareace.cn/?p=1512</guid>
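		<description><![CDATA[My company recently asked me to work on text-string recognition. After some research I found that tesseract-ocr, an open-source framework from Google, can recognize text in images, [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>My company recently asked me to work on text-string recognition. After some research I found that tesseract-ocr, an open-source framework from Google, can recognize text in images. tesseract supports many languages (the commonly used ones) and many image formats, which makes it very powerful.</p>
<p>To try out tesseract, first install tesseract_ocr from http://code.google.com/p/tesseract-ocr/. Be sure to download version 3.01. I first downloaded the latest version, 3.02, and the command that generates the character features failed; I eventually worked around that, but the resulting dictionary recognized everything as empty characters.</p>
<p>After installation, take a look at the root directory:</p>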
<p><a class="lightbox" href="http://image.lxway.com/upload/8/bb/8bb62211ff13e3bd00970b46683fbe47.jpg"><img title="tesseract_ocr 字符识别基础及训练字库、合并字库" src="http://image.lxway.com/upload/8/bb/8bb62211ff13e3bd00970b46683fbe47_thumb.jpg" alt="tesseract_ocr 字符识别基础及训练字库、合并字库" /></a></p>
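<p>The tessdata folder holds the dictionary files. Once a dictionary file is placed there, tesseract can recognize text in the corresponding language.</p>
<p>Let's start by recognizing an image:</p>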
<p><a class="lightbox" href="http://image.lxway.com/upload/3/c2/3c2943890b083c0e35ac2fb29e431195.jpg"><img title="tesseract_ocr 字符识别基础及训练字库、合并字库" src="http://image.lxway.com/upload/3/c2/3c2943890b083c0e35ac2fb29e431195_thumb.jpg" alt="tesseract_ocr 字符识别基础及训练字库、合并字库" /></a></p>
<p>Put the image in any folder, cd to that folder in cmd, and run</p><pre class="crayon-plain-tag">tesseract 1.jpg 1</pre><p>&nbsp;</p>
<p><a class="lightbox" href="http://image.lxway.com/upload/8/de/8de399d3d3341a8da353438152220518.jpg"><img title="tesseract_ocr 字符识别基础及训练字库、合并字库" src="http://image.lxway.com/upload/8/de/8de399d3d3341a8da353438152220518_thumb.jpg" alt="tesseract_ocr 字符识别基础及训练字库、合并字库" /></a></p>
<p><a class="lightbox" href="http://image.lxway.com/upload/2/be/2be57456f82c97260b1d294f2bdf5844.jpg"><img title="tesseract_ocr 字符识别基础及训练字库、合并字库" src="http://image.lxway.com/upload/2/be/2be57456f82c97260b1d294f2bdf5844_thumb.jpg" alt="tesseract_ocr 字符识别基础及训练字库、合并字库" /></a><br />
<a class="lightbox" href="http://image.lxway.com/upload/c/7a/c7a8f0e78e9bd8f0d4381a9e87aaa769.jpg"><img title="tesseract_ocr 字符识别基础及训练字库、合并字库" src="http://image.lxway.com/upload/c/7a/c7a8f0e78e9bd8f0d4381a9e87aaa769_thumb.jpg" alt="tesseract_ocr 字符识别基础及训练字库、合并字库" /></a></p>
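<p>A txt file has been generated in the folder, and the recognition result is not very good. Why? The characters in this image are distorted. tesseract matches the characters in the image against the ones in its dictionary and picks the closest; since the dictionary does not contain this distorted font, recognition errors are likely. To raise the recognition rate we need to train a font of our own.</p>
<p>Training also requires a tool called jTessBoxEditor, available from http://sourceforge.net/projects/vietocr/files/jTessBoxEditor/</p>
<p>&nbsp;</p>
<p>Now let's do it for real. First we need a .tif image set; we use jTessBoxEditor to merge several tif images into one.</p>
<p>1. Open jTessBoxEditor, choose Tools-&gt;Merge TIFF, select the tif images, and generate a single tif image set.</p>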
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>2. I generated an image set named why4.tif. cd to the directory containing why4.tif and generate the corresponding .box file.</p>
<p>Run the command</p>
<p>&nbsp;</p><pre class="crayon-plain-tag">tesseract why4.tif why4 batch.nochop makebox</pre><p>&nbsp;</p>
<p>&nbsp;</p>
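<p>This file is produced by tesseract's own recognition pass. It records the position and size of each character in the image set, along with the character that was recognized.</p>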
<p>&nbsp;</p>
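<p>As a rough illustration of what one line of the .box file holds, here is a minimal C++ sketch that parses a single entry. It assumes the usual Tesseract 3.x layout of "symbol left bottom right top page", with coordinates measured from the bottom-left corner of the image. BoxEntry, parseBoxLine, boxWidth and boxHeight are hypothetical names made up for this example; they are not part of tesseract or jTessBoxEditor.</p>

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record for one line of a Tesseract .box file.
struct BoxEntry {
    std::string symbol;  // recognized character (UTF-8 bytes)
    int left = 0;        // pixel coordinates, origin at the bottom-left corner
    int bottom = 0;
    int right = 0;
    int top = 0;
    int page = 0;        // zero-based page index inside the multi-page TIFF
};

// Parses "symbol left bottom right top page".
// Returns false (and leaves `out` untouched) when the line is malformed.
bool parseBoxLine(const std::string& line, BoxEntry& out) {
    std::istringstream in(line);
    BoxEntry e;
    if (!(in >> e.symbol >> e.left >> e.bottom >> e.right >> e.top >> e.page)) {
        return false;
    }
    if (e.right < e.left || e.top < e.bottom) {
        return false;  // a box cannot have negative width or height
    }
    out = e;
    return true;
}

// Size of the box, derived from its corner coordinates.
int boxWidth(const BoxEntry& e) {
    assert(e.right >= e.left);
    return e.right - e.left;
}

int boxHeight(const BoxEntry& e) {
    assert(e.top >= e.bottom);
    return e.top - e.bottom;
}
```

<p>A parser like this can be handy for a quick sanity check of a box file before and after editing it, for example to spot boxes with zero width.</p>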
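<p>3. Adjust. tesseract's recognition is not accurate, so we use jTessBoxEditor to correct the positions and the recognized characters.</p>
<p>Open the generated image set why4.tif in jTessBoxEditor. Note that the box file for why4.tif must be in the same folder (keep the file names unchanged); otherwise jTessBoxEditor opens the image without any position or recognition information. Make the adjustments, then save.</p>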
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>4. Generate the .tr file</p>
<p>&nbsp;</p><pre class="crayon-plain-tag">tesseract why4.tif  why4   nobatch box.train</pre><p>&nbsp;</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>5. Compute the character set, extracted from the generated box file</p>
<p>&nbsp;</p><pre class="crayon-plain-tag">unicharset_extractor why4.box</pre><p>&nbsp;</p>
<p>&nbsp;</p>
<p>6. Generate the font feature file. First create a font properties file with any name in the folder; its content has the format</p>
<p>&nbsp;</p><pre class="crayon-plain-tag">&lt;fontname&gt; &lt;italic&gt; &lt;bold&gt; &lt;fixed&gt; &lt;serif&gt; &lt;fraktur&gt;</pre><p>fontname is the font name and must match the prefix of the .tif image set and the .box file. &lt;italic&gt;, &lt;bold&gt;, &lt;fixed&gt;, &lt;serif&gt; and &lt;fraktur&gt; each take the value 1 or 0, indicating whether the font has that property.</p>
<p>&nbsp;</p>
<p>For example, I created a file named font with the following content:</p>
<p>&nbsp;</p><pre class="crayon-plain-tag">why4 0 0 0 0 0</pre>
<p>Then run the command</p>
<p>&nbsp;</p><pre class="crayon-plain-tag">mftraining -F font -U unicharset why4.tr</pre><p>&nbsp;</p>
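<p>To make the font properties format concrete, the following C++ sketch builds one such line from a font name and five flags. The function name fontPropertiesLine is hypothetical and only serves this example; the field order follows the format given above.</p>

```cpp
#include <cassert>
#include <string>

// Builds one line of the font properties file in the order
// <fontname> <italic> <bold> <fixed> <serif> <fraktur>.
std::string fontPropertiesLine(const std::string& fontName, bool italic, bool bold,
                               bool fixedPitch, bool serif, bool fraktur) {
    // Fields are separated by spaces, so the name itself must not contain one.
    assert(!fontName.empty());
    assert(fontName.find(' ') == std::string::npos);

    std::string line = fontName;
    const bool flags[] = {italic, bold, fixedPitch, serif, fraktur};
    for (bool flag : flags) {
        line += ' ';
        line += flag ? '1' : '0';
    }
    return line;
}
```

<p>Calling it with the name why4 and all flags false yields the same line as the font file shown above.</p>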
<p>7. Cluster the training files for tesseract</p>
<p>&nbsp;</p><pre class="crayon-plain-tag">cntraining why4.tr</pre><p>&nbsp;</p>
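<p>After this step many files have been generated in the folder. Add the prefix why4. to the files unicharset, inttemp, normproto and pffmtable.</p>
<p>&nbsp;</p>
<p>8. The last step: combine the files to generate the dictionary file</p>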
<p>&nbsp;</p><pre class="crayon-plain-tag">combine_tessdata why4.</pre><p>&nbsp;</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
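<p>That's it, the dictionary file has been generated. Put the generated why4.traineddata into the tessdata folder under the tesseract_ocr root directory.</p>
<p>Now we can use the font library we trained.</p>
<p>Pick any image for a test:</p>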
<p>&nbsp;</p><pre class="crayon-plain-tag">tesseract 13.jpg 13 -l why4</pre><p>&nbsp;</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
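<p>As you can see, the result is much better.</p>
<p>All in all, generating a font library is quite tedious, especially the adjustment step, which strained both my eyes and my patience. And once a library is finally done, what if it is not enough and more training material has to be added? After some digging I found a way to merge font libraries in version 3.01.</p>
<p>First, you need the generated .tif image set and the .box position file for each library. As long as these two files exist, the dictionaries can be merged.</p>
<p>I have three dictionaries to merge: why3, why4 and why5. Rename them to the form name.num, that is, why.3, why.4 and why.5.</p>
<p>1. Generate the corresponding .tr files</p>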
<p>&nbsp;</p><pre class="crayon-plain-tag">tesseract why.3.tif why.3 nobatch box.train
tesseract why.4.tif why.4 nobatch box.train
tesseract why.5.tif why.5 nobatch box.train</pre><p>2. Extract the characters from all the box files</p>
<p>&nbsp;</p><pre class="crayon-plain-tag">unicharset_extractor why.3.box why.4.box why.5.box</pre><p>3. Generate the font feature file</p>
<p>&nbsp;</p>
<p>In the newly created font file, add the font properties for every box file</p>
<p>&nbsp;</p><pre class="crayon-plain-tag">why.4 0 0 0 0 0
why.3 0 0 0 0 0
why.5 0 0 0 0 0</pre><p>&nbsp;</p><pre class="crayon-plain-tag">mftraining -F font -U unicharset why.3.tr why.4.tr why.5.tr</pre><p>4. Cluster all the .tr files</p>
<p>&nbsp;</p>
<p>&nbsp;</p><pre class="crayon-plain-tag">cntraining why.3.tr why.4.tr why.5.tr</pre><p>5. Rename the files: I added the prefix why. to unicharset, inttemp, normproto and pffmtable.</p>
<p>6. Combine all the files to generate one large font library</p>
<p>&nbsp;</p><pre class="crayon-plain-tag">combine_tessdata why.</pre><p></p>
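<p>Since the merge procedure is the same sequence of commands for any number of image sets, it can be generated rather than typed. The C++ sketch below only assembles the command strings used in this post; it does not run them. The function name mergeCommands and its parameters are hypothetical, and it assumes the font properties file is called font, as above.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the commands for training one combined language `lang` from several
// image sets named lang.<suffix>.tif with matching lang.<suffix>.box files.
std::vector<std::string> mergeCommands(const std::string& lang,
                                       const std::vector<std::string>& suffixes) {
    assert(!lang.empty());
    assert(!suffixes.empty());

    std::vector<std::string> cmds;
    std::string boxFiles;
    std::string trFiles;
    for (const std::string& s : suffixes) {
        const std::string base = lang + "." + s;
        // One box.train pass per image set produces base.tr.
        cmds.push_back("tesseract " + base + ".tif " + base + " nobatch box.train");
        boxFiles += " " + base + ".box";
        trFiles += " " + base + ".tr";
    }
    cmds.push_back("unicharset_extractor" + boxFiles);
    cmds.push_back("mftraining -F font -U unicharset" + trFiles);
    cmds.push_back("cntraining" + trFiles);
    // The rename step (prefixing unicharset, inttemp, normproto and pffmtable
    // with "lang.") belongs here; it is omitted because the rename command
    // differs between Windows cmd and Unix shells.
    cmds.push_back("combine_tessdata " + lang + ".");
    return cmds;
}
```

<p>For the three sets in this post, mergeCommands("why", {"3", "4", "5"}) returns seven commands, starting with the three box.train passes and ending with combine_tessdata why.</p>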
]]></content:encoded>
			<wfw:commentRss>https://www.softwareace.cn/?feed=rss2&#038;p=1512</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Range object in WPS</title>
		<link>https://www.softwareace.cn/?p=446</link>
		<comments>https://www.softwareace.cn/?p=446#comments</comments>
		<pubDate>Tue, 21 May 2013 09:00:52 +0000</pubDate>
		<dc:creator><![CDATA[admin]]></dc:creator>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[屏幕取词]]></category>

		<guid isPermaLink="false">http://www.softwareace.cn/?p=446</guid>
		<description><![CDATA[Environment: XP SP3, VS2008, WPS 2012, Word 2010. If you directly set the Range object's Start or E [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Environment: XP SP3, VS2008, WPS 2012, Word 2010</p>
<p>Setting the Start or End property of a Range object directly may throw an exception, so either use the SetRange method or make sure Start stays less than or equal to the Range's End.</p>
<p>In Word 2010, setting Start to a value greater than End does not throw an exception; End is simply set equal to Start.</p><pre class="crayon-plain-tag">void Ctest09242Dlg::OnBnClickedOk()
{
	// TODO: Add your control notification handler code here
	::CoInitialize(NULL);
	//OnOK();

	WPS::_ApplicationPtr g_app;
	CLSID clsid;
	HRESULT hr;
	hr=::CLSIDFromProgID(L&quot;WPS.Application&quot;,&amp;clsid);    // look up the CLSID from the ProgID

	// CreateInstance reports failure through its HRESULT rather than by throwing
	hr = g_app.CreateInstance(__uuidof(WPS::Application));
	if (FAILED(hr))
	{
		AfxMessageBox(&quot;Could not start WPS. Is it installed?&quot;);
		return;
	}

	WPS::DocumentsPtr docs =g_app-&gt;GetDocuments();
	// document content: abcdefghijklmn
	CString sWord=&quot;e:\\1011.doc&quot;;

	WPS::_DocumentPtr p_doc;
	_bstr_t sNull;

	try
	{
		p_doc = docs-&gt;Open(
			_bstr_t(sWord),
			VARIANT_FALSE,            // confirm conversions
			VARIANT_TRUE,            // read-only
			VARIANT_FALSE,            // add to recent files
			sNull,                    // document password
			sNull,                    // template password
			VARIANT_FALSE,            // revert
			sNull,                    // write password for the document
			sNull,                    // write password for the template
			0,                        // format
			KSO::ksoEncodingAutoDetect,   // encoding
			VARIANT_TRUE,            // visible
			VARIANT_FALSE,            // open and repair
			0,                        // DocumentDirection wdDocumentDirection LeftToRight
			VARIANT_FALSE            // no encoding dialog
			);

	}

	catch(const _com_error&amp;)
	{
		g_app-&gt;Quit(&amp;vtMissing,&amp;vtMissing,&amp;vtMissing);
		return ;
	}

	g_app-&gt;put_Visible(VARIANT_TRUE);

	WPS::SelectionPtr p_sel = p_doc-&gt;Get_Selection();
	WPS::FindPtr p_fid = p_sel-&gt;GetFind();

	// search for &quot;cde&quot; so that the selection covers positions 2..5
	CString sField = &quot;cde&quot;;
	_variant_t FindText=(LPCTSTR)sField;
	_variant_t ReplaceWith=vtMissing ;
	_variant_t Forward=VARIANT_TRUE;
	_variant_t Wrap=(_variant_t)(WPS::wpsFindContinue);
	_variant_t Format=VARIANT_FALSE;
	_variant_t MatchCase=VARIANT_FALSE;
	_variant_t MatchWholeWord=VARIANT_FALSE;

	_variant_t MatchWildcards=VARIANT_FALSE;
	_variant_t MatchSoundsLike=VARIANT_FALSE;
	_variant_t MatchAllWordForms=VARIANT_FALSE;

	VARIANT_BOOL bExec =  p_fid-&gt;Execute(
		&amp;FindText, &amp;MatchCase, &amp;MatchWholeWord, &amp;MatchWildcards, &amp;MatchSoundsLike,
		&amp;MatchAllWordForms, &amp;Forward, &amp;Wrap, &amp;Format, &amp;ReplaceWith,&amp;vtMissing);

	WPS::RangePtr lprage = p_sel-&gt;GetRange();
	long ns = lprage-&gt;Start;	//2
	long ne = lprage-&gt;End;		//5
	CString str = lprage-&gt;GetText();
	try
	{
		//lprage-&gt;Start = 6;	// throws: Start would exceed End
		//lprage-&gt;End = 1;		// throws: End would fall below Start
		lprage-&gt;SetRange(6,7);	// OK: both ends are set together
	}
	catch (_com_error&amp; e)
	{
		CString ErrorStr;
		ErrorStr.Format( &quot;Code = %08lx\n\tCode meaning = %s\n\tSource = %s\n\tDescription = %s\n&quot;,
			e.Error(), e.ErrorMessage(), (LPCSTR)(_bstr_t)e.Source(), (LPCSTR)(_bstr_t)(e.Description()));
	}

	// read the range back after SetRange
	long ns1 = lprage-&gt;Start;
	long ne1 = lprage-&gt;End;
	str = (CString)(LPCSTR)lprage-&gt;GetText();

	// fetch the selection's range again for comparison
	WPS::RangePtr lprage2 = p_sel-&gt;GetRange();
	long ns2 = lprage2-&gt;Start;
	long ne2 = lprage2-&gt;End;
}</pre><p>&nbsp;</p>
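<p>One way to stay clear of the exception is to normalize the two positions before handing them to SetRange. The following is a minimal, portable C++ sketch of that idea; it does not touch the WPS object model. The helper name normalizeRange and the docLength parameter are hypothetical, and clamping to the document length is an assumption of this example, not documented WPS behaviour.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Returns a (start, end) pair intended to be safe to pass to SetRange:
// both values are clamped to [0, docLength] and ordered so that start <= end.
std::pair<long, long> normalizeRange(long start, long end, long docLength) {
    assert(docLength >= 0);
    start = std::max(0L, std::min(start, docLength));
    end = std::max(0L, std::min(end, docLength));
    if (start > end) {
        std::swap(start, end);  // keep the invariant start <= end
    }
    return std::make_pair(start, end);
}
```

<p>Swapping reversed positions is a design choice of this sketch. Word 2010, as noted above, instead moves End up to Start, so if that behaviour is wanted the swap can be replaced by end = start.</p>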
]]></content:encoded>
			<wfw:commentRss>https://www.softwareace.cn/?feed=rss2&#038;p=446</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MS Office automation development in VS 2005: getting familiar with the environment</title>
		<link>https://www.softwareace.cn/?p=445</link>
		<comments>https://www.softwareace.cn/?p=445#comments</comments>
		<pubDate>Mon, 20 May 2013 04:10:25 +0000</pubDate>
		<dc:creator><![CDATA[admin]]></dc:creator>
				<category><![CDATA[Windows api]]></category>
		<category><![CDATA[屏幕取词]]></category>

		<guid isPermaLink="false">http://www.softwareace.cn/?p=445</guid>
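		<description><![CDATA[VBA is perhaps the best tool for MS Office automation, but our applications often have to deal with MS Office [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>VBA is perhaps the best tool for MS Office automation, but our applications often have to deal with MS Office files, for example generating Word documents and Excel reports, so it is well worth learning how the mainstream development tools handle MS Office automation. Starting today I will publish some introductory articles on MS Office automation development in the VS 2005 environment. This first one is about getting familiar with the development environment.</p>
<p>This time I am using VS C++ 2005. After some struggle, my impression is that VS C++ 2005 supports MS Office automation less well than VC 6.0. I say this from experience, since I have done MS Office automation in both. In VC 6.0 you only need to locate the TypeLib dll file (or tlb or olb file) and pick the classes you want; VC 6.0 then generates the .h and .cpp files for you, and you can use the classes defined in them directly.</p>
<p>I had assumed VS C++ 2005 would keep the VC 6.0 approach, but it uses a new one. That in itself would be fine; the real problem is that the new approach supports MS Office automation rather poorly (at least for Word).</p>
<p>MS Office automation basically consists of selecting the MS Office component and exporting the interface classes you need, which is the same in VC 6.0 and VS C++ 2005. Below I use a simple example to show the procedure in VS C++ 2005; the MS Office version is Office 2003.</p>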
<p>&nbsp;</p>
<p>First create a new single-document project named Owner in VS 2005, then add a class to the project and choose “MFC Class From TypeLib”, as shown below:</p>
<p><img title="点击查看大图" alt="" src="http://www.cr173.com/up/2010-4/201043235146250.JPG" width="600" border="0" /></p>
<p>&nbsp;</p>
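<p>After clicking “Add”, the dialog below appears. It shows one improvement of VS 2005 over VC 6.0: there is an additional source for the import, the registry. The advantage of the registry is that the names are descriptive. If you choose a file instead and want to automate Word, you first have to work out which dll, olb or tlb file the Word classes are hidden in.</p>
<p>Going by the name, we select Microsoft Word 11.0 Object Library&lt;8.3&gt; from the available type libraries. A long list of interface classes appears; not knowing which ones we need, we simply import them all, as shown below:</p>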
<p><img title="点击查看大图" alt="" src="http://www.cr173.com/up/2010-4/201043235148130.JPG" width="600" border="0" /></p>
<p>&nbsp;</p>
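<p>After clicking “Finish” you will find that a pile of .h files has suddenly appeared in the project: CAddIn.h, CAddIns.h, and so on. You may wonder where the corresponding cpp files are. Sorry, VS 2005 did not generate any. By now you can probably sense one difference between VS 2005 and VC 6.0 in Office automation.</p>
<p>With so many newcomers in the project, compile first. It compiles fine, and you might say that VS 2005 and VC 6.0 are much the same after all. Don't jump to conclusions; start coding:</p>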
<div>
<div><pre class="crayon-plain-tag">#include "CApplication.h"
CApplication app;</pre></div>


</div>


&nbsp;

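Compile again, and this time some unexpected compile errors appear:

<pre class="crayon-plain-tag">1&gt;f:\mytest\mytest\src\intdir\debug\owner\msword.tlh(1073) : error C2786: 'BOOL (__stdcall *)(HDC,int,int,int,int)' : invalid operand for __uuidof
1&gt;f:\mytest\mytest\src\intdir\debug\owner\msword.tlh(1073) : error C2923: '_com_IIID' : 'Rectangle' is not a valid template type argument for parameter '_Interface'
1&gt;        c:\program files\microsoft visual studio 8\vc\platformsdk\include\wingdi.h(3514) : see declaration of 'Rectangle'
1&gt;f:\mytest\mytest\src\intdir\debug\owner\msword.tlh(1073) : error C3203: '_com_IIID' : unspecialized class template can't be used as a template argument for template parameter '_IIID', expected a real type
1&gt;f:\mytest\mytest\src\intdir\debug\owner\msword.tlh(7113) : warning C4003: not enough actual parameters for macro 'ExitWindows'
1&gt;f:\mytest\mytest\src\intdir\debug\owner\msword.tlh(7113) : error C2059: syntax error : 'constant'
1&gt;f:\mytest\mytest\src\intdir\debug\owner\msword.tlh(13448) : error C2146: syntax error : missing ';' before identifier 'Fonts'
1&gt;     f:\mytest\mytest\src\intdir\debug\owner\msword.tlh(13448) : error C4430: missing type specifier - int assumed. Note: C++ does not support default-int</pre>

At this point you may cry out: My God, what is going on? Sorry, I don't know exactly why this happens either. My first guess is that VS C++ 2005 does not handle the MSWORD.olb component well, while some other components work. The messages do hint at name clashes, though: Rectangle is declared in wingdi.h and ExitWindows is a Windows macro. Here is a workaround. In CApplication.h, take the automatically generated line: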

<div>

<div><pre class="crayon-plain-tag">#import "C:\\Program Files\\Microsoft Office\\OFFICE11\\MSWORD.OLB" no_namespace</pre></div>


</div>


&nbsp;

and comment it out. Then add the following code:

<div>

<div><pre class="crayon-plain-tag">#import "C:\Program Files\Common Files\Microsoft Shared\Office11\MSO.DLL"
#import "c:\Program Files\Common Files\Microsoft Shared\VBA\VBA6\VBE6EXT.olb"
#import "c:\Program Files\Microsoft Office\Office11\MSWORD.olb" \
    rename("ExitWindows","_ExitWindows")
#import "c:\Program Files\Microsoft Office\Office11\EXCEL.exe" \
    rename("DialogBox","_DialogBox") \
    rename("RGB","_RGB") \
    exclude("IFont","IPicture")</pre></div>


</div>


&nbsp;

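After this, choose “Rebuild” to recompile the whole project (note: Rebuild, not Build), and the CApplication class can be used without problems, although some warnings remain.

In addition, the following annoying dialog pops up from time to time during compilation: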

<img title="点击查看大图" alt="" src="http://www.cr173.com/up/2010-4/201043235150172.JPG" border="0" />

I have not yet found a way to get rid of this dialog.
</p>
]]></content:encoded>
			<wfw:commentRss>https://www.softwareace.cn/?feed=rss2&#038;p=445</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to set text on “another” win32 application</title>
		<link>https://www.softwareace.cn/?p=420</link>
		<comments>https://www.softwareace.cn/?p=420#comments</comments>
		<pubDate>Wed, 15 May 2013 02:22:44 +0000</pubDate>
		<dc:creator><![CDATA[admin]]></dc:creator>
				<category><![CDATA[屏幕取词]]></category>

		<guid isPermaLink="false">http://www.softwareace.cn/?p=420</guid>
		<description><![CDATA[I am using spy++ and see that the control I have has th [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>I am using Spy++ and can see that the handle I have for the control matches the hex value shown in Spy++ (after converting from decimal, of course), and the parent window matches as well. So I have the IntPtr for a Label and the IntPtr for the form/window, but my <code>SendMessage</code> call does not change the text in the target application.</p>
<p>Another approach may be to do something like this post, but what is the control id and how do I get it? <a href="http://stackoverflow.com/questions/1100605/settext-of-textbox-in-external-app-win32-api">SetText of textbox in external app. Win32 API</a></p>
<p>I assume the hWnd here needs to be the control&#8217;s hWnd, correct?</p><pre class="crayon-plain-tag">SendMessageCall(hWnd, WM_SETTEXT, (IntPtr)value.Length, value);</pre><p>I notice that getting the text IS WORKING</p><pre class="crayon-plain-tag">SendMessageCall(hWnd, WM_GETTEXT, (IntPtr)sb.Capacity, sb);</pre><p>and I notice that when I get the text I see the correct value; when I set the text, nothing changes on screen, yet getting the text again using <code>SendMessage</code> returns the new value while the application still shows the old one. Do I need to send a repaint message, and if so, what is the code for that?</p>
<p>&nbsp;</p>
<p>You don&#8217;t send a window message to force repaint, instead you call <code>InvalidateRect(hWnd, NULL, TRUE)</code>.</p>
<p>&nbsp;</p>
<p><a href="http://stackoverflow.com/questions/8888710/how-to-set-text-on-another-win32-application">http://stackoverflow.com/questions/8888710/how-to-set-text-on-another-win32-application</a></p>
]]></content:encoded>
			<wfw:commentRss>https://www.softwareace.cn/?feed=rss2&#038;p=420</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to get the text of a password edit box in another process using C++</title>
		<link>https://www.softwareace.cn/?p=418</link>
		<comments>https://www.softwareace.cn/?p=418#comments</comments>
		<pubDate>Wed, 15 May 2013 01:58:25 +0000</pubDate>
		<dc:creator><![CDATA[admin]]></dc:creator>
				<category><![CDATA[屏幕取词]]></category>

		<guid isPermaLink="false">http://www.softwareace.cn/?p=418</guid>
		<description><![CDATA[]]></description>
				<content:encoded><![CDATA[<p></p><pre class="crayon-plain-tag">// hwnd is the handle of the target control in the other process.
char buf[1024] = "";         // receives the control text
char cClassName[1024] = "";  // receives the window class name
::GetClassName(hwnd, cClassName, 1024);
CString strTemp(cClassName);
if (strTemp == "Edit")
{
    LONG lngWndStyle = GetWindowLong(hwnd, GWL_STYLE);
    if (lngWndStyle &amp; ES_PASSWORD)
    {
        // EM_GETPASSWORDCHAR must be sent with SendMessage; with PostMessage the return value is wrong.
        int intPasswordChar = (int)SendMessage(hwnd, EM_GETPASSWORDCHAR, 0, 0);
        // Across processes, EM_SETPASSWORDCHAR must go through PostMessage; SendMessage has no effect.
        PostMessage(hwnd, EM_SETPASSWORDCHAR, 0, 0);   // 0 removes the mask character
        UpdateWindow(hwnd);
        Sleep(100);   // pause 100 ms; this is important
        ::SendMessage(hwnd, WM_GETTEXT, (WPARAM)sizeof(buf), (LPARAM)buf);
        // Restore the original mask character, again with PostMessage.
        PostMessage(hwnd, EM_SETPASSWORDCHAR, intPasswordChar, 0);
    }
    else
    {
        // Not a password box: the text can be read directly.
        ::SendMessage(hwnd, WM_GETTEXT, (WPARAM)sizeof(buf), (LPARAM)buf);
    }
}</pre><p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>https://www.softwareace.cn/?feed=rss2&#038;p=418</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How can I get the text of another process&#8217; window?</title>
		<link>https://www.softwareace.cn/?p=417</link>
		<comments>https://www.softwareace.cn/?p=417#comments</comments>
		<pubDate>Wed, 15 May 2013 01:41:17 +0000</pubDate>
		<dc:creator><![CDATA[admin]]></dc:creator>
				<category><![CDATA[屏幕取词]]></category>

		<guid isPermaLink="false">http://www.softwareace.cn/?p=417</guid>
		<description><![CDATA[As you&#8217;ve probably found out by now, calling GetW [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>As you&#8217;ve probably found out by now, calling <code>GetWindowText()</code> won&#8217;t work most of the time. The reason for this is that <code>GetWindowText()</code> won&#8217;t do the necessary translation between the address spaces of the two processes. This is required because the address that the calling process passes to <code>GetWindowText()</code> in the <code>lpString</code> parameter is not valid in the address space of the target process.</p>
<p>However, there is one way to get around it, and that is sending a <code>WM_GETTEXT</code> message to the target window. Now you might be wondering how this could work, if, after all, <code>GetWindowText()</code> sends a <code>WM_GETTEXT</code> message as part of its implementation.</p>
<p>The answer is that Windows treats some messages differently when they are sent directly across process boundaries, and provides support for address translation (which is not really a translation at all: Windows uses memory-mapped files to accomplish the copy). <code>WM_GETTEXT</code> is one of those, as are <code>WM_SETTEXT</code> and <code>WM_COPYDATA</code>.</p>
<p>Keep in mind, however, that this will not work for all windows, for the simple reason that they do not store their text using <code>WM_SETTEXT</code> and use their own buffers for it, but don&#8217;t handle the <code>WM_GETTEXT</code> message appropriately.</p>
<p>Finally, note that <code>GetWindowText()</code> <i>will</i> work under some circumstances, namely, when the target window passes <code>WM_SETTEXT</code> messages to <code>DefWindowProc()</code>. In this case, Windows holds the window text itself in internal structures, which happens to be saved in memory shared by all processes (a memory-mapped file), so <code>GetWindowText()</code> will retrieve the text directly, without needing to go across process boundaries.</p>
]]></content:encoded>
			<wfw:commentRss>https://www.softwareace.cn/?feed=rss2&#038;p=417</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Capture2Text</title>
		<link>https://www.softwareace.cn/?p=373</link>
		<comments>https://www.softwareace.cn/?p=373#comments</comments>
		<pubDate>Fri, 19 Apr 2013 09:47:04 +0000</pubDate>
		<dc:creator><![CDATA[admin]]></dc:creator>
				<category><![CDATA[屏幕取词]]></category>
		<category><![CDATA[ocr]]></category>

		<guid isPermaLink="false">http://www.softwareace.cn/?p=373</guid>
		<description><![CDATA[Capture2Text Contents What is Capture2Text? Download Ho [&#8230;]]]></description>
				<content:encoded><![CDATA[<h1>Capture2Text</h1>
<h2><a name="toc"></a>Contents</h2>
<ul>
<li><a href="http://capture2text.sourceforge.net/#intro">What is Capture2Text?</a></li>
<li><a href="http://capture2text.sourceforge.net/#download">Download</a></li>
<li><a href="http://capture2text.sourceforge.net/#how_to_install">How to Install</a></li>
<li><a href="http://capture2text.sourceforge.net/#ocr">OCR</a></li>
<li><a href="http://capture2text.sourceforge.net/#speech">Speech Recognition</a></li>
<li><a href="http://capture2text.sourceforge.net/#output_options">Output Options</a></li>
<li><a href="http://capture2text.sourceforge.net/#configuration">Configuration</a></li>
<li><a href="http://capture2text.sourceforge.net/#substitutions">Substitutions</a></li>
<li><a href="http://capture2text.sourceforge.net/#command_line">Command Line Options</a></li>
</ul>
<h2><a name="intro"></a>What is Capture2Text?</h2>
<p>Capture2Text enables users to do the following:</p>
<ul>
<li>Optical Character Recognition (OCR): Allows the user to <i>quickly</i> snapshot a small portion of the screen, OCR it and (by default) save the result to the clipboard.</li>
<li>Speech Recognition: Using speech recognition, the user can speak into their microphone and Capture2Text will convert the speech to text. If the speech recognition technology is not 100% sure, Capture2Text will present the user with a list of the most likely transcriptions. The selected result will (by default) be copied to the clipboard.</li>
</ul>
<p>Conceptual illustration:</p>
<p><img alt="" src="http://capture2text.sourceforge.net/images/ocr_and_voice.png" /></p>
<h2><a name="download"></a>Download</h2>
<p>The latest version can be found on the <a href="https://sourceforge.net/projects/capture2text/files/Capture2Text/">Capture2Text download page</a> hosted by SourceForge. Source code is included.</p>
<h2><a name="how_to_install"></a>How to Install</h2>
<ol>
<li>Unzip the contents of the zip file. Make sure that there are no Asian or other non-ASCII characters in the path where you unzipped it. Also, if you are on Windows 7, don&#8217;t unzip it to the Program Files directory (this will avoid issues related to write privileges).</li>
<li>Double-click on Capture2Text.exe. You should see the Capture2Text icon on the bottom-right of your screen (though it might be hidden in which case you will have to click on the &#8220;Show hidden icons&#8221; arrow).</li>
</ol>
<h2><a name="ocr"></a>OCR</h2>
<p>Capture2Text can OCR the following languages:</p>
<table border="0">
<tbody>
<tr>
<td>Afrikaans</td>
<td>Frankish</td>
<td>Maltese</td>
</tr>
<tr>
<td>Albanian</td>
<td>French</td>
<td>Norwegian</td>
</tr>
<tr>
<td>Ancient Greek</td>
<td>Galician</td>
<td>Polish</td>
</tr>
<tr>
<td>Arabic</td>
<td>German</td>
<td>Portuguese</td>
</tr>
<tr>
<td>Azerbaijani</td>
<td>Greek</td>
<td>Romanian</td>
</tr>
<tr>
<td>Basque</td>
<td>Hebrew</td>
<td>Russian</td>
</tr>
<tr>
<td>Belarusian</td>
<td>Hindi</td>
<td>Serbian</td>
</tr>
<tr>
<td>Bengali</td>
<td>Hungarian</td>
<td>Slovakian</td>
</tr>
<tr>
<td>Bulgarian</td>
<td>Icelandic</td>
<td>Slovenian</td>
</tr>
<tr>
<td>Catalan</td>
<td>Indonesian</td>
<td>Spanish</td>
</tr>
<tr>
<td>Cherokee</td>
<td>Italian</td>
<td>Swahili</td>
</tr>
<tr>
<td>Chinese</td>
<td>Japanese</td>
<td>Swedish</td>
</tr>
<tr>
<td>Croatian</td>
<td>Kannada</td>
<td>Tagalog</td>
</tr>
<tr>
<td>Czech</td>
<td>Korean</td>
<td>Tamil</td>
</tr>
<tr>
<td>Danish</td>
<td>Latvian</td>
<td>Telugu</td>
</tr>
<tr>
<td>Dutch</td>
<td>Lithuanian</td>
<td>Thai</td>
</tr>
<tr>
<td>English</td>
<td>Macedonian</td>
<td>Turkish</td>
</tr>
<tr>
<td>Esperanto</td>
<td>Malay</td>
<td>Ukrainian</td>
</tr>
<tr>
<td>Estonian</td>
<td>Malayalam</td>
<td>Vietnamese</td>
</tr>
<tr>
<td>Finnish</td>
<td>Maltese</td>
<td></td>
</tr>
</tbody>
</table>
<p>By default only Chinese, English, French, German, Japanese, and Spanish are installed.</p>
<p>To acquire other languages:</p>
<ol>
<li>Download the appropriate OCR language dictionaries from <a href="http://code.google.com/p/tesseract-ocr/downloads/list">http://code.google.com/p/tesseract-ocr/downloads/list</a>. These files end in &#8220;.tar.gz&#8221; (ex. tesseract-ocr-3.02.rus.tar.gz).</li>
<li>Open the &#8220;.tar.gz&#8221; file you just downloaded with 7-Zip or similar decompression software and navigate to the directory that has the file that ends in &#8220;.traineddata&#8221;.</li>
<li>Drag the &#8220;.traineddata&#8221; file (and any other file in this directory) to this path in the Capture2Text directory: Capture2Text\Utils\tesseract\tessdata</li>
<li>Restart Capture2Text</li>
</ol>
<p>Note: Arabic and Hindi are more CPU intensive and will thus be slower to OCR.</p>
<p><b>OCR Usage</b></p>
<p>Press the OCR capture key (default: Windows Key + Q) to start the capture. Now, using your mouse, resize the capture box over the area of the screen that you want to OCR. A preview of the captured OCR&#8217;d text will appear in the top-left corner of the screen. Press the capture key again or the left mouse button to complete the capture. The captured screen area will be OCR&#8217;d and the textual result will be stored in the clipboard by default.</p>
<p>To cancel an OCR capture, press Esc.</p>
<p>To move the capture box, hold down the right mouse button and drag the mouse.</p>
<p>To nudge the capture box, use the arrow keys.</p>
<p>To toggle the active capture box corner, press the space bar.</p>
<p>To change the OCR language, right-click the Capture2Text tray icon, select the OCR Language option and then select the desired language.</p>
<p>To quickly switch between 3 languages, use the OCR language quick access keys: Windows Key + 1, Windows Key + 2, and Windows Key + 3.</p>
<p>When the Tesseract version of Chinese or Japanese is selected, you should specify the text direction (vertical or horizontal) using the text direction key: Windows Key + W. The text direction will not have any effect on the NHocr Chinese or NHocr Japanese dictionaries.</p>
<p>Using the Preferences dialog, you can change the following OCR settings:</p>
<div>
<ul>
<li>OCR Hotkeys.</li>
<li>Current OCR Language.</li>
<li>The 3 Quick-Access OCR Languages.</li>
<li>Capture Box color and opacity.</li>
<li>Enable/Disable the preview box and change its colors, font and opacity.</li>
<li>Change the text direction (used for Chinese and Japanese).</li>
</ul>
</div>
<h2><a name="speech"></a>Speech Recognition</h2>
<p>Capture2Text can perform speech recognition for the following languages:</p>
<table>
<tbody>
<tr>
<td>Afrikaans</td>
<td>French</td>
<td>Polish</td>
</tr>
<tr>
<td>Chinese</td>
<td>German</td>
<td>Portuguese</td>
</tr>
<tr>
<td>Czech</td>
<td>Italian</td>
<td>Russian</td>
</tr>
<tr>
<td>Dutch</td>
<td>Japanese</td>
<td>Spanish</td>
</tr>
<tr>
<td>English</td>
<td>Korean</td>
<td>Turkish</td>
</tr>
</tbody>
</table>
<p><b>Speech Recognition Usage</b></p>
<p>Press the speech recognition capture key (default: Windows Key + A) to start the capture. You will see a box that says &#8220;Recording&#8230;&#8221; in the top-left corner of your screen. Speak a word or phrase or sentence into your microphone. Capture2Text will automatically recognize when you are done speaking and will display a box that says &#8220;Analyzing&#8230;&#8221;. The speech recognition will take a couple of seconds. When the speech recognition is complete you will see a list of possible transcriptions to choose from. When you choose a transcription, it will be stored in the clipboard by default.</p>
<p>When the results window is displayed, you can press Enter to select the first transcription or use the number keys (1-9) to select the corresponding transcription.</p>
<p>To cancel a speech recognition capture, press Esc.</p>
<p>To change the speech recognition language, right-click the Capture2Text tray icon, select the Speech Recognition Language option and then select the desired language.</p>
<p>To quickly toggle between 2 languages, use the speech recognition language hotkey: Windows Key + 4.</p>
<p>Using the Preferences dialog, you can change the following speech recognition settings:</p>
<div>
<ul>
<li>Speech recognition Hotkeys.</li>
<li>Current speech recognition Language.</li>
<li>The 2 speech recognition languages to toggle between.</li>
<li>The properties of the Results window (font, color, number of results).</li>
<li>How much silence to wait for before recording stops.</li>
</ul>
</div>
<h2><a name="output_options"></a>Output Options</h2>
<p>By default, the OCR&#8217;d or speech recognized text will be placed in the clipboard.</p>
<p>You also have 3 more ways to output the text.</p>
<p>To send the text to a pop-up window you can right-click the Capture2Text tray icon and select Show Popup Window.</p>
<p>To send the text to whichever textbox currently contains the blinking cursor/I-beam, right-click the Capture2Text tray icon and select Send to Cursor.</p>
<p>Advanced: To send the text directly to a window/control (for example, Notepad++), first fill in the Send to Control settings in the Preferences dialog. Once this is done you may enable/disable the option by right-clicking the Capture2Text tray icon and selecting Send to Control.</p>
<p>Using the Preferences dialog, you can change the following output settings:</p>
<div>
<ul>
<li>Text to prepend/append to the captured text.</li>
<li>Enable/Disable outputting to the clipboard.</li>
<li>Enable/Disable outputting to a popup window.</li>
<li>Popup window properties (default width and height).</li>
<li>Enable/Disable sending the output text to the cursor.</li>
<li>Enable/Disable outputting to a control.</li>
<li>Additional command to send to the output control.</li>
</ul>
</div>
<h2><a name="configuration"></a>Configuration</h2>
<p>Right-click the Capture2Text tray icon in the bottom-right of your screen and then select the &#8220;Preferences&#8230;&#8221; option to bring up the Preferences dialog.</p>
<h2><a name="substitutions"></a>Substitutions</h2>
<p>Sometimes Capture2Text consistently makes the same OCR mistakes such as recognizing an &#8220;M&#8221; as &#8220;I\/|&#8221;.</p>
<p>By editing the substitutions.txt file in the Capture2Text directory, you can tell Capture2Text to substitute one text string for another.</p>
<p>Just find the appropriate language section and add one substitution per line in this format: from_text = to_text</p>
<p>Example (adding 3 substitutions to the English section):</p>
<dl>
<dt>English:</dt>
<dd>I\/| = M</dd>
<dd>&gt;&lt; = X</dd>
<dd>some%space%text = some_text</dd>
</dl>
<p>To create a substitution regardless of language, add the substitution to the &#8220;All:&#8221; section.</p>
<p>Special tokens and escape characters:</p>
<table>
<tbody>
<tr>
<td>%space%</td>
<td>Space character</td>
</tr>
<tr>
<td>%tab%</td>
<td>Tab character</td>
</tr>
<tr>
<td>%eq%</td>
<td>Equals (=)</td>
</tr>
<tr>
<td>%perc%</td>
<td>Percent sign (%)</td>
</tr>
<tr>
<td>%lf%</td>
<td>Linefeed character (\n)</td>
</tr>
<tr>
<td>%cr%</td>
<td>Carriage return character (\r)</td>
</tr>
</tbody>
</table>
<p>You may disable a substitution by adding a &#8220;#&#8221; in front.</p>
<p>When done editing substitutions.txt, either restart Capture2Text or switch language for the substitutions to take effect.</p>
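<p>The file format described above can be sketched in a few lines of Python. This is only an illustration of the rules, not Capture2Text's actual parser: the function names are made up, splitting each line at the first equals sign is an assumption, and so is applying the &#8220;All:&#8221; section together with the selected language section.</p>
<pre>
```python
import re

# Special tokens from the table above, mapped to the characters they stand for.
TOKENS = {"space": " ", "tab": "\t", "eq": "=", "perc": "%", "lf": "\n", "cr": "\r"}
TOKEN_RE = re.compile(r"%(space|tab|eq|perc|lf|cr)%")


def expand_tokens(text):
    """Replace every special token in a single left-to-right pass."""
    return TOKEN_RE.sub(lambda m: TOKENS[m.group(1)], text)


def parse_substitutions(lines, language):
    """Collect (from_text, to_text) pairs from the All: and language sections."""
    wanted = {"All:", language + ":"}
    active = False
    pairs = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a substitution disabled with "#"
        if line.endswith(":") and "=" not in line:
            active = line in wanted  # section header such as "English:"
            continue
        if active and "=" in line:
            src, dst = line.split("=", 1)  # assumption: split at the first "="
            pairs.append((expand_tokens(src.strip()), expand_tokens(dst.strip())))
    return pairs


def apply_substitutions(text, pairs):
    """Apply the substitutions in file order."""
    for src, dst in pairs:
        text = text.replace(src, dst)
    return text
```
</pre>
<p>With the three English substitutions from the example, apply_substitutions would turn the OCR result &#8220;I\/|A&gt;&lt;&#8221; into &#8220;MAX&#8221;.</p>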
<h2><a name="command_line"></a>Command Line Options</h2>
<p>You may OCR the screen via command line by calling Capture2Text in this format:</p>
<p>Capture2Text.exe x1 y1 x2 y2 [output_file]</p>
<dl>
<dt>Required Arguments:</dt>
<dd>x1 &#8211; X1-Coordinate of the screen</dd>
<dd>y1 &#8211; Y1-Coordinate of the screen</dd>
<dd>x2 &#8211; X2-Coordinate of the screen</dd>
<dd>y2 &#8211; Y2-Coordinate of the screen</dd>
<dt>Optional Arguments:</dt>
<dd>output_file &#8211; The OCR&#8217;d text will be written to this file if specified.</dd>
</dl>
<p>Capture2Text will read settings.ini to determine settings such as OCR language and output options (clipboard, popup, etc.).</p>
<dl>
<dt>Examples:</dt>
<dd>Capture2Text.exe 10 152 47 321 output.txt</dd>
<dd>Capture2Text.exe 10 152 47 321</dd>
</dl>
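<p>When Capture2Text is driven from a script, the call format above maps directly onto an argument list. The following is a minimal sketch, assuming Capture2Text.exe can be found on PATH or in the current directory; the helper names capture2text_args and run_capture are hypothetical.</p>
<pre>
```python
import subprocess


def capture2text_args(x1, y1, x2, y2, output_file=None, exe="Capture2Text.exe"):
    """Build the argument list for: Capture2Text.exe x1 y1 x2 y2 [output_file]."""
    args = [exe, str(x1), str(y1), str(x2), str(y2)]
    if output_file is not None:
        args.append(output_file)  # optional file that receives the recognized text
    return args


def run_capture(x1, y1, x2, y2, output_file=None):
    """Run Capture2Text; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(capture2text_args(x1, y1, x2, y2, output_file), check=True)
```
</pre>
<p>Calling run_capture(10, 152, 47, 321, "output.txt") corresponds to the first example above.</p>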
]]></content:encoded>
			<wfw:commentRss>https://www.softwareace.cn/?feed=rss2&#038;p=373</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Recognizing Image CAPTCHAs with Open-Source Tools (ImageMagick + tesseract-ocr)</title>
		<link>https://www.softwareace.cn/?p=366</link>
		<comments>https://www.softwareace.cn/?p=366#comments</comments>
		<pubDate>Thu, 18 Apr 2013 05:34:01 +0000</pubDate>
		<dc:creator><![CDATA[admin]]></dc:creator>
				<category><![CDATA[屏幕取词]]></category>

		<guid isPermaLink="false">http://www.softwareace.cn/?p=366</guid>
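		<description><![CDATA[The power of open source is enormous; thanks to it, even someone like me who knows nothing about CAPTCHAs can recognize many basic ones. [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>The power of open source is enormous; thanks to it, even someone like me who knows nothing about CAPTCHAs can recognize many basic ones.</p>
<p>---------------- A low-key divider ----------------</p>
<p>Linux has two important programming rules, or even design philosophies: the Rule of Modularity (write simple parts connected by clean interfaces) and the Rule of Composition (design programs to be connected with other programs). Linux has countless small programs that are small in size and simple in function, yet combined in the right way they can do almost anything. A great advantage of the command line is how easily it lets you combine tools. Imagine having to process ten thousand text files and replace part of their contents: with a graphical program such as Word, hardly anyone could get through it.</p>
<p>Today we will use two open-source programs: ImageMagick and tesseract-ocr.</p>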
<p>---------------- ImageMagick ----------------</p>
<p>First, a short introduction (the English text comes from the <a href="http://www.imagemagick.org/script/index.php" target="_blank">official site</a>):</p>
<p>ImageMagick® is a software suite to create, edit, and compose bitmap images. It can read, convert and write images in a variety of formats (over 100).</p>
<p>The functionality of ImageMagick is typically utilized from the command line or you can use the features from programs written in your favorite programming language. Choose from these interfaces: G2F (Ada), MagickCore (C), MagickWand (C), ChMagick (Ch), ImageMagickObject (COM+), Magick++ (C++), JMagick (Java), L-Magick (Lisp), NMagick (Neko/haXe), MagickNet (.NET), PascalMagick (Pascal), PerlMagick (Perl), MagickWand for PHP (PHP), IMagick (PHP), PythonMagick (Python), RMagick (Ruby), or TclMagick (Tcl/TK).</p>
<p>ImageMagick is free software delivered as a ready-to-run binary distribution or as source code that you may freely use, copy, modify, and distribute in both open and proprietary applications. It is distributed under an Apache 2.0-style license.</p>
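<p>Second, the official ImageMagick site appears to be blocked by the Great Firewall (and it is a purely technical site!), so we cannot get the program from it directly; here is a <a href="http://www.duote.com/soft/8047.html#download" target="_blank">download</a> hosted in China.</p>
<p>Finally, installation. There is not much to say: the simplest approach is to click Next all the way through, though you can change the install directory if you like. Don't worry, no Baidu toolbar is bundled.</p>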
<p>---------------- tesseract-ocr ----------------</p>
<p>Now an introduction to tesseract-ocr. As before, the English text comes from the <a href="http://code.google.com/p/tesseract-ocr/">official site</a> (you did not click the wrong link: this site is not blocked):</p>
<p>An OCR Engine that was developed at HP Labs between 1985 and 1995&#8230; and now at Google. (OCR stands for Optical Character Recognition.)</p>
<p>The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but it is probably one of the most accurate open source OCR engines available. The source code will read a binary, grey or color image and output text. A tiff reader is built in that will read uncompressed TIFF images, or libtiff can be added to read compressed images.</p>
<p>Two notes on this text: &#8220;the source code&#8221; is probably a slip for &#8220;the program&#8221;, and &#8220;binary&#8221; here means a two-level (black-and-white) image, as opposed to grayscale or color.</p>
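<p>Since the official site is not blocked, you can <a href="http://code.google.com/p/tesseract-ocr/downloads/list" target="_blank">download</a> directly from it. We need tesseract-2.04.exe.tar.gz and tesseract-2.00.eng.tar.gz. The first is the main program. The second is the trained data needed to recognize English text and digits, somewhat like the virus definitions of antivirus software. tesseract-ocr can also recognize Dutch, Spanish, German and other languages; we do not need those here, so we skip them.</p>
<p>Finally, this software needs no installation; just unpack it. First unpack tesseract-2.04.exe.tar.gz, then unpack the contents of tesseract-2.00.eng.tar.gz into the tesseract root directory. If the contents of tesseract-2.00.eng.tar.gz end up in the wrong place, running tesseract fails with: Unable to load unicharset file ./tessdata/eng.unicharset.</p>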
<p>---------------- CAPTCHA recognition ----------------</p>
<p>How the two programs relate:</p>
<p>tesseract is image-illiterate: by default it only understands uncompressed TIFF images. Running tesseract directly on an image in another format fails like this:<br />
Tesseract Open Source OCR Engine<br />
name_to_image_type:Error:Unrecognized image type:code.jpg<br />
IMAGE::read_header:Error:Can't read this image type:code.jpg<br />
tesseract:Error:Read of file failed:code.jpg</p>
<p>So we need ImageMagick to convert the image format; ImageMagick has other uses as well.</p>
<p>Suppose the CAPTCHA image to recognize is code.jpg. Only two steps are needed:</p>
<p>d:\ImageMagick\convert.exe -compress none -depth 8 -alpha off ./code.jpg ./code.tif<br />
D:\tesseract\tesseract.exe ./code.tif ./result</p>
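<p>OK, the result is now in the text file ./result.txt; tesseract automatically appends the .txt extension to ./result. Here is some explanation of the two commands.</p>
<p>convert.exe: part of the ImageMagick suite, responsible for image format conversion. Its arguments mean:</p>
<ul>
<li>-compress none: do not compress the converted image. Without this option, tesseract later fails with: read_tif_image:Error:Illegal image format:Compression</li>
<li>-depth 8: set the color depth of the converted image to 8 bits, i.e. 8 bpp. Without this option, the result is:<br />
Tesseract Open Source OCR Engine<br />
check_legal_image_size:Error:Only 1,2,4,5,6,8 bpp are supported:16<br />
Segmentation fault</li>
<li>-alpha off: do not add an alpha channel to the converted image. Without this option, the result is the same as above.</li>
<li>Next comes the file name of the image to convert, and last the file name of the converted image.</li>
</ul>
<p>tesseract.exe: the OCR engine, which we are &#8220;abusing&#8221; here for CAPTCHA recognition.</p>
<ul>
<li>./code.tif: the image to recognize.</li>
<li>./result: the base name of the file that stores the result; tesseract automatically appends the .txt extension.</li>
</ul>
<p>It is that simple: with just two commands, the CAPTCHA text is waiting for us in result.txt.</p>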
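<p>The two steps above can be wrapped in a short script. This is a sketch under stated assumptions rather than a finished tool: it assumes executables named convert and tesseract are on PATH (the article calls them by full path instead), that the result file can be read as UTF-8, and the function names are hypothetical.</p>
<pre>
```python
import os
import subprocess


def convert_args(src, dst, convert_exe="convert"):
    """ImageMagick step: uncompressed, 8-bit, no alpha channel."""
    return [convert_exe, "-compress", "none", "-depth", "8", "-alpha", "off", src, dst]


def tesseract_args(image, output_base, tesseract_exe="tesseract"):
    """tesseract step: the text is written to output_base plus .txt."""
    return [tesseract_exe, image, output_base]


def recognize(src, workdir="."):
    """Run both steps and return the recognized text."""
    tif = os.path.join(workdir, "code.tif")
    base = os.path.join(workdir, "result")
    subprocess.run(convert_args(src, tif), check=True)
    subprocess.run(tesseract_args(tif, base), check=True)
    # Assumption: the output file is readable as UTF-8.
    with open(base + ".txt", encoding="utf-8") as fh:
        return fh.read().strip()
```
</pre>
<p>Keeping the argument lists in separate functions makes it easy to check the commands without actually running either program.</p>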
<p>---------------- Optimization tricks ----------------</p>
<p>On <a href="http://www.huangshifu.net/2010/01/29/ocr-stuff.html" target="_blank">Huang Shifu's blog</a> I found some possible optimizations (not verified by me), recorded here:</p>
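<p>To improve the recognition rate, first convert the image to grayscale or black and white by adding one of these options to the convert command: -monochrome (two colors only, pure black and white) or -colorspace Gray (grayscale, which keeps the shades of gray and tends to work somewhat better).</p>
<p>Enlarge the image (150% as an example): convert in.tif -scale 150% in2.tif</p>
<p>To crop the image, use the -crop option to cut a given region out of an image as a sub-image (see <a href="http://juggler.javaeye.com/blog/28009" target="_blank">here</a>). The format is -crop widthxheight{+-}x{+-}y{%}, where:</p>
<ul>
<li>width: the width of the sub-image.</li>
<li>height: the height of the sub-image.</li>
<li>x: when positive, the x-coordinate of the region's top-left corner. When negative, the region starts at x-coordinate 0 and the resulting sub-image is narrower by |x| pixels.</li>
<li>y: when positive, the y-coordinate of the region's top-left corner. When negative, the region starts at y-coordinate 0 and the resulting sub-image is shorter by |y| pixels.</li>
</ul>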
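<p>The three preprocessing options can be combined into a single convert call. The sketch below only builds the argument list; the helper names, the default executable name convert, and the order in which the options are applied (crop, then grayscale, then scale) are assumptions made for illustration, and like the tips themselves the combination is untested against real CAPTCHAs.</p>
<pre>
```python
def crop_geometry(width, height, x=0, y=0):
    """Format a -crop geometry such as 100x40+10+5 (each offset keeps its sign)."""
    return "{}x{}{:+d}{:+d}".format(width, height, x, y)


def preprocess_args(src, dst, gray=True, scale_percent=None, crop=None,
                    convert_exe="convert"):
    """Build a convert command that optionally crops, grays and enlarges."""
    args = [convert_exe, src]
    if crop is not None:
        # crop is a (width, height, x, y) tuple
        args += ["-crop", crop_geometry(*crop)]
    if gray:
        args += ["-colorspace", "Gray"]
    if scale_percent is not None:
        args += ["-scale", "{}%".format(scale_percent)]
    args.append(dst)
    return args
```
</pre>
<p>For example, preprocess_args("in.tif", "in2.tif", gray=False, scale_percent=150) reproduces the enlargement command shown above.</p>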
]]></content:encoded>
			<wfw:commentRss>https://www.softwareace.cn/?feed=rss2&#038;p=366</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tesseract, an Open-Source OCR Engine</title>
		<link>https://www.softwareace.cn/?p=365</link>
		<comments>https://www.softwareace.cn/?p=365#comments</comments>
		<pubDate>Thu, 18 Apr 2013 05:09:09 +0000</pubDate>
		<dc:creator><![CDATA[admin]]></dc:creator>
				<category><![CDATA[屏幕取词]]></category>

		<guid isPermaLink="false">http://www.softwareace.cn/?p=365</guid>
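		<description><![CDATA[Version 3.0 of the well-known open-source OCR engine Tesseract was released recently and can be downloaded from the project site: http://code.goo [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Version 3.0 of the well-known open-source OCR engine Tesseract was released recently and can be downloaded from the project site: <a href="http://code.google.com/p/tesseract-ocr">http://code.google.com/p/tesseract-ocr</a>. The new version supports Chinese; the Chinese language pack is at <a title="http://code.google.com/p/tesseract-ocr/downloads/detail?name=chi_sim.traineddata.gz" href="http://code.google.com/p/tesseract-ocr/downloads/detail?name=chi_sim.traineddata.gz">http://code.google.com/p/tesseract-ocr/downloads/detail?name=chi_sim.traineddata.gz</a>.</p>
<p>Tesseract is an OCR engine developed by <a href="http://research.google.com/pubs/author4479.html">Ray Smith</a> at HP's Bristol laboratory between 1985 and 1995; it ranked among the top engines in the 1995 UNLV accuracy test. Development essentially stopped after 1996. In 2006 Google invited Smith to join and restarted the project. The project is currently licensed under Apache 2.0 and supports mainstream platforms such as Windows, Linux and Mac OS. As an engine, however, it only provides a command-line tool.</p>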
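<p>The command format for recognizing an image is:<br />
tesseract &lt;imagename&gt; &lt;outputbase&gt; [-l lang] [configfile [[+|-]varfile]&#8230;]<br />
Here tesseract is the command; &lt;imagename&gt; is the image to recognize, for example eurotext.tif; &lt;outputbase&gt; is the base name of the output text file, to which the .txt extension is added; [-l lang] is optional and specifies the language of the text in the image.</p>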
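<p>As a rough illustration of this command format, the helper below assembles the arguments and reports where the text will end up. The function name is hypothetical, and passing chi_sim as the language assumes that the Chinese language pack mentioned above has been placed in the tessdata directory.</p>
<pre>
```python
def tesseract_command(image, output_base, lang=None, exe="tesseract"):
    """Return (argument list, path of the text file tesseract will write)."""
    args = [exe, image, output_base]
    if lang:
        args += ["-l", lang]  # optional language, e.g. eng or chi_sim
    return args, output_base + ".txt"
```
</pre>
<p>For example, tesseract_command("eurotext.tif", "eurotext") yields the arguments tesseract eurotext.tif eurotext and the output path eurotext.txt; the argument list can then be handed to subprocess.run.</p>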
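<p>Tesseract also has a .NET version, available at <a href="http://www.pixel-technology.com/freeware/tessnet2/">http://www.pixel-technology.com/freeware/tessnet2/</a>. It turned out to be very easy to use; note that you still need to download a language pack. To raise the recognition rate you can also train the engine yourself: tesseract-OCR supports training, which improves recognition accuracy (for different fonts) or adds support for new languages. Roughly, you generate box files from TIFF files that contain known characters, correct them by hand, and use them to train tesseract-OCR. Some <a href="http://www.ub-filosofie.ro/~solcan/wt/gnu/t/tbe.html">training tools</a> can also help with this process.</p>
<p>Tesseract is image-illiterate: by default it only understands uncompressed TIFF images. Running tesseract directly on an image in another format fails like this:<br />
Tesseract Open Source OCR Engine<br />
name_to_image_type:Error:Unrecognized image type:code.jpg<br />
IMAGE::read_header:Error:Can't read this image type:code.jpg<br />
tesseract:Error:Read of file failed:code.jpg</p>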
<p>So we need ImageMagick to convert the image format. ImageMagick (TM) is a free software suite for creating, editing and composing images. It can read, convert and write images in many formats. Images can be cropped, colors can be changed, various effects can be applied, images can be rotated and combined, and text, lines, polygons, ellipses and curves can be added to images and stretched and rotated. ImageMagick is free software: it is delivered with full source code and can be freely used, copied, modified and distributed. Its license is an Apache 2.0-style license that is compatible with the GPL. It runs on most operating systems.</p>
<p>Most of ImageMagick's functionality is used through command-line tools. It can also be used from programming languages such as Perl, C, C++, Python, PHP, Ruby and Java; ready-made interfaces (PerlMagick, Magick++, PythonMagick, MagickWand for PHP, RubyMagick, and JMagick) are available. This makes it possible to modify or create images automatically and dynamically.</p>
<p>ImageMagick supports at least 90 image formats: A, ART, AVI, AVS, B, BIE, BMP, BMP2, BMP3, C, CACHE, CAPTION, CIN, CIP, CLIP, CLIPBOARD, CMYK, CMYKA, CUR, CUT, DCM, DCX, DNG, DOT, DPS, DPX, EMF, EPDF, EPI, EPS, EPS2, EPS3, EPSF, EPSI, EPT, EPT2, EPT3, FAX, FITS, FPX, FRACTAL, G, G3, GIF, GIF87, GRADIENT, GRAY, HDF, HISTOGRAM, HTM, HTML, ICB, ICO, ICON, JBG, JBIG, JNG, JP2, JPC, JPEG, JPG, JPX, K, LABEL, M, M2V, MAP, MAT, MATTE, MIFF, MNG, MONO, MPC, MPEG, MPG, MSL, MTV, MVG, NULL, O, OTB, P7, PAL, PALM, PATTERN, PBM, PCD, PCDS, PCL, PCT, PCX, PDB, PDF, PFA, PFB, PGM, PGX, PICON, PICT, PIX, PJPEG, PLASMA, PNG, PNG24, PNG32, PNG8, PNM, PPM, PREVIEW, PS, PS2, PS3, PSD, PTIF, PWP, R, RAS, RGB, RGBA, RGBO, RLA, RLE, SCR, SCT, SFW, SGI, SHTML, STEGANO, SUN, SVG, SVGZ, TEXT, TGA, TIF, TIFF, TILE, TIM, TTC, TTF, TXT, UIL, UYVY, VDA, VICAR, VID, VIFF, VST, WBMP, WMF, WMFWIN32, WMZ, WPG, X, XBM, XC, XCF, XPM, XV, XWD, Y, YCbCr, YCbCrA, YUV. For details see <a title="http://www.imagemagick.com.cn/" href="http://www.imagemagick.com.cn/">http://www.imagemagick.com.cn/</a>.</p>
<h4><a name="dot-net"></a>ImageMagick-related .NET projects:</h4>
<blockquote><p>Use <a href="http://imagemagick.codeplex.com/">MagickNet</a> to convert, compose, and edit images from Windows .NET.</p>
<p><a href="http://sourceforge.net/projects/imagemagickapp/">ImageMagickApp</a> is a .NET application written in C# that utilizes the ImageMagick command line to allow conversion of multiple image formats to different formats.</p></blockquote>
<p>Suppose the CAPTCHA image to recognize is code.jpg. Only two steps are needed:</p>
<p>d:\ImageMagick\convert.exe -compress none -depth 8 -alpha off ./code.jpg ./code.tif<br />
D:\tesseract\tesseract.exe ./code.tif ./result</p>
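<p>The result is now in the text file ./result.txt; tesseract automatically appends the .txt extension to ./result. Here is some explanation of the conversion command.</p>
<p>convert.exe: part of the ImageMagick suite, responsible for image format conversion. Its arguments mean:</p>
<ul>
<li>-compress none: do not compress the converted image. Without this option, tesseract later fails with: read_tif_image:Error:Illegal image format:Compression</li>
<li>-depth 8: set the color depth of the converted image to 8 bits, i.e. 8 bpp. Without this option, the result is:<br />
Tesseract Open Source OCR Engine<br />
check_legal_image_size:Error:Only 1,2,4,5,6,8 bpp are supported:16<br />
Segmentation fault</li>
<li>-alpha off: do not add an alpha channel to the converted image. Without this option, the result is the same as above.</li>
<li>Next comes the file name of the image to convert, and last the file name of the converted image.</li>
</ul>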
]]></content:encoded>
			<wfw:commentRss>https://www.softwareace.cn/?feed=rss2&#038;p=365</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Tesseract OCR Open-Source Project</title>
		<link>https://www.softwareace.cn/?p=364</link>
		<comments>https://www.softwareace.cn/?p=364#comments</comments>
		<pubDate>Thu, 18 Apr 2013 04:56:37 +0000</pubDate>
		<dc:creator><![CDATA[admin]]></dc:creator>
				<category><![CDATA[屏幕取词]]></category>
		<category><![CDATA[ocr]]></category>

		<guid isPermaLink="false">http://www.softwareace.cn/?p=364</guid>
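		<description><![CDATA[Recently a project required image-based CAPTCHA recognition, so I did some initial exploring with the open-source Tesseract OCR project. Its introduction [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Recently a project required image-based CAPTCHA recognition, so I did some initial exploring with the open-source Tesseract OCR project. Its introduction reads:</p>
<p>This package contains the Tesseract Open Source OCR Engine. Originally developed at Hewlett Packard Laboratories Bristol and at Hewlett Packard Co, Greeley Colorado.</p>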
<p>The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but it is probably one of the most accurate open source OCR engines available. The source code will read a binary, grey or color image and output text. A tiff reader is built in that will read uncompressed TIFF images, or libtiff can be added to read compressed images.</p>
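<p>Tesseract is an open-source optical character recognition (OCR) project that can be used to recognize image CAPTCHAs. For example, given a text image in TIF format, Tesseract recognizes the text in it and writes the result to a text file, and the results are quite good. To recognize text in other languages, you need to download the corresponding language packs.</p>
<p>The Tesseract project lives at <a href="http://code.google.com/p/tesseract-ocr/">http://code.google.com/p/tesseract-ocr/</a>; you can download a release package there or visit the site for more information.</p>
<p>I downloaded the fairly recent version 2.04 from <a href="http://tesseract-ocr.googlecode.com/files/tesseract-2.04.tar.gz">http://tesseract-ocr.googlecode.com/files/tesseract-2.04.tar.gz</a>. I am not sure whether my network was at fault and packets were lost during the download, or whether there was some other cause, but following the instructions on the project site did not give me a working installation. After carefully reading the documentation and the FAQ on the project site, I finally found the cause. Here is a brief record of the setup process.</p>
<p>The downloaded archive is tesseract-2.04.tar.gz. I unpacked it on Fedora Core 7 Linux, as root, in the /root directory. The resulting directory is tesseract-2.04, which contains a rather mixed collection of files. The installation proceeds as follows:</p>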
<p><strong>1. Compile Tesseract</strong></p>
<p>After unpacking tesseract-2.04.tar.gz, the files under tesseract-2.04 appear to be read-only, so the file permissions need to be changed:</p>
<p>[root@bogon tesseract-2.04]# chmod 777 -R *</p>
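<p>Then run the following three commands with their defaults to configure, build and install:</p>
<p>[root@bogon tesseract-2.04]# ./configure<br />
[root@bogon tesseract-2.04]# make<br />
[root@bogon tesseract-2.04]# make install</p>
<p>This may take a while.</p>
<p><strong>2. Set up the language pack</strong></p>
<p>By default the steps above install the data files into /usr/local/share/tessdata. Check that directory first: if the files in it (not counting the configs and tessconfigs directories) are all 0 bytes, something went wrong, and starting the Tesseract OCR engine produces the following error:</p>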
<p>Unable to load unicharset file /usr/local/share/tessdata/eng.unicharset</p>
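<p>This is bound to fail: the file /usr/local/share/tessdata/eng.unicharset is empty and cannot be loaded. Next check /root/tesseract-2.04/tessdata: if the files there (again not counting the configs and tessconfigs directories) are also all 0 bytes, the data has to be downloaded separately. My guess is that the files under /usr/local/share/tessdata are empty because the files in /root/tesseract-2.04/tessdata were invalid, so the install step copied empty files into /usr/local/share/tessdata and the setup failed.</p>
<p>So download the language pack separately: after downloading <a href="http://tesseract-ocr.googlecode.com/files/tesseract-2.00.eng.tar.gz">http://tesseract-ocr.googlecode.com/files/tesseract-2.00.eng.tar.gz</a> and unpacking it, you get a tessdata directory. Copy the 8 non-empty files in it into /usr/local/share/tessdata, overwriting the empty files, and the problem is solved.</p>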
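<p>The zero-byte check described above can be automated. The following sketch lists the regular files in a tessdata directory whose size is 0 bytes; the function name is hypothetical, and the directory to check is whatever path your installation used (/usr/local/share/tessdata in this article).</p>
<pre>
```python
import os


def empty_data_files(tessdata_dir):
    """Return the names of regular files in tessdata_dir that are 0 bytes long."""
    empty = []
    for name in sorted(os.listdir(tessdata_dir)):
        path = os.path.join(tessdata_dir, name)
        # Sub-directories such as configs and tessconfigs are skipped.
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            empty.append(name)
    return empty
```
</pre>
<p>If empty_data_files("/usr/local/share/tessdata") returns a non-empty list, the language data needs to be replaced as described above.</p>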
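<p><strong>3. Start the Tesseract OCR engine and recognize an image</strong><br />
Now prepare the image file to recognize. I used a TIF image file from the Tesseract release package.</p>
<p>The command format for recognizing an image is:<br />
tesseract &lt;imagename&gt; &lt;outputbase&gt; [-l lang] [configfile [[+|-]varfile]&#8230;]<br />
Here tesseract is the command; &lt;imagename&gt; is the image to recognize, for example eurotext.tif; &lt;outputbase&gt; is the base name of the output text file, to which the .txt extension is added; [-l lang] is optional and specifies the language of the text in the image.</p>
<p>For example, to start the Tesseract OCR engine and recognize the text image eurotext.tif, run:</p>
<p>[root@bogon tesseract-2.04]# tesseract eurotext.tif eurotext<br />
Tesseract Open Source OCR Engine<br />
[root@bogon tesseract-2.04]#</p>
<p>In the tesseract-2.04 directory you can now see eurotext.txt, the text file produced from the image eurotext.tif. Its contents are:</p>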
<p>The (quick) [brown] {fox} jumps! Over the $43,456.78 &lt;lazy&gt; #90 dog &amp; duck/goose, as 12.5% of E-mail from aspammer@website.com is spam. Der ,,schnelle&#8221; braune Fuchs springt uber den faulen Hund. Le renard brun &lt;&lt;rapide» saute par-dessus le chien paresseux. La volpe marrone rapida salta sopra il cane pigro. El zorro marron répido salta sobre el perro perezoso. A raposa marrom répida salta sobre o cio preguicoso.</p>
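<p>As you can see, the recognition accuracy is quite high, and with the phototest.tif image shipped in the release package the result is 100% correct. However, these images contain very little noise, so no firm conclusion about accuracy can be drawn from them; further testing is needed.</p>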
]]></content:encoded>
			<wfw:commentRss>https://www.softwareace.cn/?feed=rss2&#038;p=364</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
