boost::locale::boundary::mapping< RangeIterator > Class Template Reference
[Boundary Analysis]

Class that holds the boundary mapping of a text, for use with iterators.

#include <boost/locale/boundary.hpp>


Public Types

typedef RangeIterator iterator
typedef RangeIterator::base_iterator base_iterator
typedef std::iterator_traits< base_iterator >::value_type char_type

Public Member Functions

 mapping (boundary_type type, base_iterator begin, base_iterator end, std::locale const &loc=std::locale())
 mapping (boundary_type type, base_iterator begin, base_iterator end, unsigned mask, std::locale const &loc=std::locale())
void map (boundary_type type, base_iterator begin, base_iterator end, std::locale const &loc=std::locale())
void map (boundary_type type, base_iterator begin, base_iterator end, unsigned mask, std::locale const &loc=std::locale())
 mapping ()
template<typename ORangeIterator>
 mapping (mapping< ORangeIterator > const &other)
template<typename ORangeIterator>
void swap (mapping< ORangeIterator > &other)
template<typename ORangeIterator>
mapping const & operator= (mapping< ORangeIterator > const &other)
unsigned mask () const
void mask (unsigned u)
RangeIterator begin () const
RangeIterator end () const

Friends

class break_iterator
class token_iterator
class mapping


Detailed Description

template<class RangeIterator>
class boost::locale::boundary::mapping< RangeIterator >

Class that holds the boundary mapping of a text, for use with iterators.

When the object is created, it builds an index of the boundaries and provides access to it through iterators. It is mostly used together with break_iterator and token_iterator. For each boundary point it provides a mark that describes the boundary and lets you distinguish between different types of boundaries. For example, it marks whether a sentence terminates because a mark like "?" or "." was found or because a newline character is present in the text.

These marks can be read out with token_iterator::mark() and break_iterator::mark() member functions.

This class stores iterators to the original text, so be careful with iterator invalidation. If the iterators to the original text become invalid, the mapping can no longer be used.
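The idea of an index of marked boundary points can be illustrated with a small standalone model. This is only a sketch and not the Boost.Locale implementation: the names toy_boundary, toy_sentence_index, toy_sentence_term and toy_sentence_sep are hypothetical, the rule for finding sentence ends is deliberately naive, and the toy stores offsets where the real class stores iterators into the original text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mark bits, loosely modelled on the sentence marks described above.
constexpr unsigned toy_sentence_term = 1u << 0; // ended by a terminator such as '.' or '?'
constexpr unsigned toy_sentence_sep  = 1u << 1; // ended by a newline

// One boundary point: where it is and why it is there.
struct toy_boundary {
    std::size_t offset; // position in the text, in [0, text.size()]
    unsigned mark;      // 0 if no terminator or newline caused this boundary
};

// Build a naive sentence index: a boundary at the start of the text, one after
// every terminator or newline, and one at the end of the text.
inline std::vector<toy_boundary> toy_sentence_index(const std::string& text) {
    std::vector<toy_boundary> index;
    index.push_back({0, 0u});
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '.' || c == '?' || c == '!')
            index.push_back({i + 1, toy_sentence_term});
        else if (c == '\n')
            index.push_back({i + 1, toy_sentence_sep});
    }
    // Add the end of the text unless the last boundary already sits there.
    if (index.back().offset != text.size())
        index.push_back({text.size(), 0u});
    assert(index.back().offset == text.size());
    return index;
}
```

Because the toy keeps offsets, it survives changes to the string object as long as the characters stay the same. The real mapping keeps iterators, which is why the original text must remain valid and unmodified while the mapping is in use.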

Examples:

boundary.cpp and wboundary.cpp.


Member Typedef Documentation

template<class RangeIterator>
typedef RangeIterator boost::locale::boundary::mapping< RangeIterator >::iterator

Iterator type that is used to iterate over boundaries

template<class RangeIterator>
typedef RangeIterator::base_iterator boost::locale::boundary::mapping< RangeIterator >::base_iterator

Underlying iterator that is used to iterate over the original text.

template<class RangeIterator>
typedef std::iterator_traits<base_iterator>::value_type boost::locale::boundary::mapping< RangeIterator >::char_type

The character type of the text


Constructor & Destructor Documentation

template<class RangeIterator>
boost::locale::boundary::mapping< RangeIterator >::mapping ( boundary_type  type,
base_iterator  begin,
base_iterator  end,
std::locale const &  loc = std::locale() 
) [inline]

Create a mapping of the given boundary type for the text in the range [begin, end), using locale loc.

template<class RangeIterator>
boost::locale::boundary::mapping< RangeIterator >::mapping ( boundary_type  type,
base_iterator  begin,
base_iterator  end,
unsigned  mask,
std::locale const &  loc = std::locale() 
) [inline]

Create a mapping of the given boundary type for the text in the range [begin, end), using locale loc, and set the boundary mask to mask.

template<class RangeIterator>
boost::locale::boundary::mapping< RangeIterator >::mapping (  )  [inline]

Default constructor; creates an empty mapping.

template<class RangeIterator>
template<typename ORangeIterator>
boost::locale::boundary::mapping< RangeIterator >::mapping ( mapping< ORangeIterator > const &  other  )  [inline]

Copy the mapping. Note that you can copy a mapping that is used with token_iterator to one that is used with break_iterator, and vice versa.


Member Function Documentation

template<class RangeIterator>
void boost::locale::boundary::mapping< RangeIterator >::map ( boundary_type  type,
base_iterator  begin,
base_iterator  end,
std::locale const &  loc = std::locale() 
) [inline]

Create a mapping of the given boundary type for the text in the range [begin, end), using locale loc.

template<class RangeIterator>
void boost::locale::boundary::mapping< RangeIterator >::map ( boundary_type  type,
base_iterator  begin,
base_iterator  end,
unsigned  mask,
std::locale const &  loc = std::locale() 
) [inline]

Create a mapping of the given boundary type for the text in the range [begin, end), using locale loc, and set the boundary mask to mask.

template<class RangeIterator>
template<typename ORangeIterator>
void boost::locale::boundary::mapping< RangeIterator >::swap ( mapping< ORangeIterator > &  other  )  [inline]

Swap the mappings. Note that you can swap a mapping that is used with token_iterator with one that is used with break_iterator, and vice versa. This operation invalidates all iterators.

template<class RangeIterator>
template<typename ORangeIterator>
mapping const& boost::locale::boundary::mapping< RangeIterator >::operator= ( mapping< ORangeIterator > const &  other  )  [inline]

Copy the mapping. Note that you can copy a mapping that is used with token_iterator to one that is used with break_iterator, and vice versa.

template<class RangeIterator>
unsigned boost::locale::boundary::mapping< RangeIterator >::mask (  )  const [inline]

Get the current boundary mask.

template<class RangeIterator>
void boost::locale::boundary::mapping< RangeIterator >::mask ( unsigned  u  )  [inline]

Set the current boundary mask.

This mask provides fine-grained control over the types of boundaries and tokens you want to work with. For example, if you want to find sentence breaks that are caused only by a terminator like "." or "?" and ignore newlines, you can set the mask to sentence_term, and the break iterator will iterate only over boundaries that match this mask.

Note: the beginning and the end of the text are always considered legal boundaries, regardless of whether they have a mark that fits the mask.
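The filtering rule for break iterators, including the special treatment of the two ends of the text, can be sketched in a few lines of standalone C++. This is an illustrative model only, not library code: toy_boundary, toy_visible_breaks and the toy_sentence_* mark bits are hypothetical names, and the index is assumed to be sorted by offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mark bits standing in for the library's sentence marks.
constexpr unsigned toy_sentence_term = 1u << 0; // ended by '.' or '?'
constexpr unsigned toy_sentence_sep  = 1u << 1; // ended by a newline

struct toy_boundary {
    std::size_t offset; // position in the text
    unsigned mark;      // bits describing why the boundary exists
};

// Return the offsets a break iterator would visit under the given mask.
// The first and last boundaries are always kept, whatever their marks are.
inline std::vector<std::size_t> toy_visible_breaks(
        const std::vector<toy_boundary>& index, unsigned mask) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < index.size(); ++i) {
        const bool edge = (i == 0) || (i + 1 == index.size());
        if (edge || (index[i].mark & mask) != 0) {
            // The index is assumed to be sorted by offset.
            assert(out.empty() || out.back() <= index[i].offset);
            out.push_back(index[i].offset);
        }
    }
    return out;
}
```

With an index holding a terminator boundary and a newline boundary between the two ends, a mask of toy_sentence_term skips the newline boundary but still reports both ends of the text.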

For a token iterator, the mask determines which kinds of tokens are selected. Please note that a token iterator generally selects the largest span of text that has the specific mark. This is especially relevant for word boundary analysis.

For example, if you set the mask to word_any (selects numbers and letters), then when you iterate over "To be, or not to be?" you get "To", "be", "or", "not", "to", "be". You can ask the token iterator to use a wider type of selection by calling token_iterator::full_select(true), so that it selects the tokens "To", " be", ", or", " not", " to", " be". It all depends on your actual needs. For word selection you would probably want the first (default) and for sentence selection the second.
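The difference between the default selection and the wider selection can be reproduced with a standalone model. This is a sketch under simplifying assumptions, not the library's algorithm: toy_token, toy_word_tokens, toy_select, toy_word_any and toy_word_none are hypothetical names, words are taken to be runs of ASCII letters and digits, and the full_select parameter only imitates the behaviour described for token_iterator::full_select(true) by extending each selected token back to the end of the previously selected one.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mark bits standing in for the library's word marks.
constexpr unsigned toy_word_none = 1u << 0; // spaces and punctuation
constexpr unsigned toy_word_any  = 1u << 1; // letters and digits

struct toy_token {
    std::size_t begin, end; // half-open range [begin, end) in the text
    unsigned mark;
};

inline bool toy_is_word_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0;
}

// Split the text into maximal runs of word and non-word characters.
inline std::vector<toy_token> toy_word_tokens(const std::string& text) {
    std::vector<toy_token> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        const bool word = toy_is_word_char(text[i]);
        std::size_t j = i;
        while (j < text.size() && toy_is_word_char(text[j]) == word) ++j;
        tokens.push_back({i, j, word ? toy_word_any : toy_word_none});
        i = j;
    }
    return tokens;
}

// Select tokens whose mark matches the mask. With full_select, each selected
// token also covers the unselected text since the previous selected token.
inline std::vector<std::string> toy_select(const std::string& text,
                                           unsigned mask, bool full_select) {
    std::vector<std::string> out;
    std::size_t prev_end = 0;
    for (const toy_token& t : toy_word_tokens(text)) {
        if ((t.mark & mask) == 0) continue;
        const std::size_t from = full_select ? prev_end : t.begin;
        assert(from <= t.end);
        out.push_back(text.substr(from, t.end - from));
        prev_end = t.end;
    }
    return out;
}
```

Calling toy_select("To be, or not to be?", toy_word_any, false) yields the six bare words, while passing true yields "To", " be", ", or", " not", " to", " be", matching the two selections described above. The trailing "?" belongs to no selected token in either case.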

Changing the mask does not invalidate existing iterators, but newly created iterators will not be compatible with the old ones, so you cannot compare them. Be careful with this.

template<class RangeIterator>
RangeIterator boost::locale::boundary::mapping< RangeIterator >::begin (  )  const [inline]

Get begin iterator used when object was created

template<class RangeIterator>
RangeIterator boost::locale::boundary::mapping< RangeIterator >::end (  )  const [inline]

Get end iterator used when object was created



Generated on Thu Mar 18 23:02:03 2010 for Boost.Locale by doxygen 1.5.6