meryngii.neta

In search of new "neta" (material) again today.

unordered_set

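Unordered was added in Boost 1.36.0.
I expect unordered_map to take over from map for associating keys with data.
unordered_set is another matter: I cannot think of many uses for it, and sample code is hard to find.
So I wrote one myself. The task: a program that highlights given words in a piece of text.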

#include <iostream>
#include <fstream>
#include <iterator>
#include <string>
#include <boost/unordered_set.hpp>

using std::string;

// Prints the pending word, wrapped in <> if it is in `words`, then clears it.
template <class ContainerT>
inline void show(const ContainerT& words, string& word)
{
	if (!word.empty())
	{
		if (words.find(word) != words.end())
			std::cout << '<' << word << '>';
		
		else
			std::cout << word;
		
		word.clear();
	}
}

int main()
{
	boost::unordered_set<string> words;
	
	{
		std::ifstream fs("words.txt");
		
		// One word per line; blank lines are skipped.
		std::string word;
		while (std::getline(fs, word))
		{
			if (!word.empty())
				words.insert(word);
		}
	}
	
	std::string text;
	
	{
		// Read the whole file into memory.
		std::ifstream fs("text.txt");
		std::istreambuf_iterator<char> first(fs), last;
		text.assign(first, last);
	}
	
	{
		string word;
		string::const_iterator first = text.begin(), last = text.end();
		
		while (first != last)
		{
			char ch = *(first++);
			
			switch (ch)
			{
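				// Delimiters: they end the current word and are echoed below.
				case '\r': case '\n':
				case ' ': case '\t':
				case ',': case '.':
				case '\"': case '\'':
					break;
				
				// Any other character extends the current word.
				default:
					word += ch;
					continue;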
			}
			
			show(words, word);
			std::cout << ch;
		}
		
		// The text may end without a delimiter, so flush the last word.
		show(words, word);
	}
	
	return 0;
}
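
The core loop of the program above can also be packaged as a function that returns the highlighted text instead of printing it, which makes it easier to try out without the two input files. This is only a sketch: `highlight` is a name made up here, and it assumes the C++11 std::unordered_set in place of boost::unordered_set so that it builds without Boost.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper (not part of the original program): returns `text`
// with every word found in `words` wrapped in angle brackets. Delimiters
// are copied through unchanged, as in the program above.
std::string highlight(const std::string& text,
                      const std::unordered_set<std::string>& words)
{
    // Same delimiter set as the switch statement in the original program.
    static const std::string delimiters = "\r\n \t,.\"'";
    std::string result, word;

    // Append the pending word, bracketed if it is in the set, then reset it.
    auto flush = [&]() {
        if (word.empty()) return;
        if (words.count(word) != 0)
            result += '<' + word + '>';
        else
            result += word;
        word.clear();
    };

    for (char ch : text)
    {
        if (delimiters.find(ch) == std::string::npos)
        {
            word += ch;      // still inside a word
        }
        else
        {
            flush();         // a delimiter ends the current word
            result += ch;    // and is kept in the output
        }
    }
    flush();                 // the text may end without a delimiter
    return result;
}
```

Calling highlight("the story of C++.", words) with a set holding "C++" and "story" should give "the <story> of <C++>.".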

words.txt

Design
Evolution
C++
language
free
story
simply

text.txt (quoted from "C++ in 2005" in D&E)

"The Design and Evolution of C++", often called D&E, is the personal favorite among my books. Writing it, I was free of the usual rigid constraints of textbook and academic paper styles. There was essentially no precedent for writing a book retrospectively about the design of a language, so I could simply tell the story about how C++ came about, why it looks the way it does, and why alternative decisions were not taken.

Result

"The <Design> and <Evolution> of <C++>", often called D&E, is the personal favorite among my books. Writing it, I was <free> of the usual rigid constraints of textbook and academic paper styles. There was essentially no precedent for writing a book retrospectively about the design of a <language>, so I could <simply> tell the <story> about how <C++> came about, why it looks the way it does, and why alternative decisions were not taken.

I tried Boost.Tokenizer and the String Algorithm library, but I could not get them to keep the delimiters, so I wrote the splitting by hand.
Cases where the element itself carries the meaning seem fairly rare. I suppose the same goes for set, though I have hardly used set either...
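
The comparison with set can be made concrete. In the sketch below, both containers answer the membership question in the same way; what differs is that set iterates in sorted order (with logarithmic lookup), while unordered_set iterates in an unspecified order (with average constant-time lookup). The helper names `contains` and `in_iteration_order` are made up for this example, and std::unordered_set from C++11 stands in for the Boost container.

```cpp
#include <set>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: membership test that works with either container,
// written the same way as the find() call in show().
template <class SetT>
bool contains(const SetT& s, const std::string& key)
{
    return s.find(key) != s.end();
}

// Hypothetical helper: copies the elements out in iteration order.
// For std::set this is sorted order; for std::unordered_set the order
// is unspecified and must not be relied upon.
template <class SetT>
std::vector<std::string> in_iteration_order(const SetT& s)
{
    return std::vector<std::string>(s.begin(), s.end());
}
```

For the highlighting program only `contains` matters, so either container would do; the unordered one simply skips the work of keeping the words sorted.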
What unordered_multiset is good for is even less clear to me.
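
One possible use, offered only as a guess: a "bag" of words in which duplicates are kept, so that count() reports how often each word occurs. The sketch below assumes the C++11 std::unordered_multiset (same interface as the Boost container for these calls), and `collect_words` is a made-up name. An unordered_map from word to counter would probably be the more common way to do the same job.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper: collects every word of `text`, duplicates included.
// Words are split on the same delimiters as in the program above.
std::unordered_multiset<std::string> collect_words(const std::string& text)
{
    static const std::string delimiters = "\r\n \t,.\"'";
    std::unordered_multiset<std::string> bag;
    std::string word;

    for (char ch : text)
    {
        if (delimiters.find(ch) == std::string::npos)
        {
            word += ch;          // still inside a word
        }
        else if (!word.empty())
        {
            bag.insert(word);    // equal words are stored side by side
            word.clear();
        }
    }
    if (!word.empty())
        bag.insert(word);        // last word, if the text has no trailing delimiter
    return bag;
}
```

With the bag built from "why it looks the way it does, and why not", bag.count("why") should be 2 and bag.count("does") should be 1.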