Shard Detail

tinysegmenter v0.1.0

Tokenizer for Japanese text, written in Crystal

Install & Use

Add the following to your project's shard.yml, under dependencies (for production use) or development_dependencies (for development-only use):


tinysegmenter:
  github: spencerking/tinysegmenter

Readme

tinysegmenter

Crystal port of TinySegmenter.js for tokenizing Japanese text.

Forked from: https://qiita.com/ikasamt/items/471bfae96ce590a4fe82

Installation

  1. Add the dependency to your shard.yml:

    dependencies:
      tinysegmenter:
        github: spencerking/tinysegmenter
    
  2. Run shards install

Usage

require "tinysegmenter"

# Read a UTF-8 Japanese text file and tokenize the whole corpus.
corpus = File.read("./timemachineu8j.txt")
results = TinySegmenter.tokenize(corpus)
puts results
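For a quick check without a corpus file, you can tokenize a short string directly. This is a minimal sketch assuming `TinySegmenter.tokenize` accepts a `String` and returns the segments as an `Array(String)`, as in the JavaScript original; the sample sentence is illustrative:

```crystal
require "tinysegmenter"

# TinySegmenter segments Japanese text using pretrained character-feature
# scores, so no external dictionary is needed.
segments = TinySegmenter.tokenize("私の名前は中野です")

# Join with a visible separator to inspect the word boundaries.
puts segments.join(" | ")
```

If the call succeeds, each element of `segments` is one surface token, in the original order.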

Development

TODO: Write development instructions here

Contributing

  1. Fork it (https://github.com/spencerking/tinysegmenter/fork)
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Contributors