.. _install:

Installing polymr
#################

Getting the latest release of polymr is a snap.

.. code:: bash

    pip install polymr

To get the full source distribution, including extra storage backends,
tests, and documentation, clone the GitHub repository::

    git clone https://github.com/massmutual/polymr
    cd polymr

To run the tests for polymr::

    python setup.py test

To build the docs::

    cd doc
    make html


Using the Python API
####################

Interacting with the polymr API is best shown by example. The ``data``
directory of the source repository contains a CSV of the senators
serving in the 190th Massachusetts General Court. The examples below
index, query, and modify that data.

Creating polymr indexes
_______________________

Let's start by opening and indexing the sample data, storing it in a
LevelDB backend.

.. doctest::

    >>> import polymr
    >>> be = polymr.storage.LevelDBBackend('data/ma_senators.polymr')
    >>> with open('data/ma_senators.csv') as f:
    ...     records = polymr.record.from_csv(
    ...         f,
    ...         searched_fields_idxs=[0,1],
    ...         pk_field_idx=3
    ...     )
    ...     polymr.index.create(records, 1, 10, be)
    ...
    >>> be.get_rowcount()
    38

Searching
_________

Now that we have a backend populated with an index, we can create an
Index object and run some searches.

.. doctest::

    >>> import polymr
    >>> be = polymr.storage.LevelDBBackend('data/ma_senators.polymr')
    >>> index = polymr.query.Index(be)
    >>> index.search(['', 'oconnor'])
    [{'fields': ['Patrick', "O'Connor"], 'pk': '520', 'score': 0.7777777777777778, 'data': [b'Republican', b'617-722-1646', b'Patrick.OConnor@masenate.gov'], 'rownum': 26}, {'fields': ['Kathleen', "O'Connor Ives"], 'pk': '215', 'score': 0.8571428571428572, 'data': [b'Democrat', b'617-722-1604', b'Kathleen.OConnorIves@masenate.gov'], 'rownum': 27}, {'fields': ['Sonia', 'Chang-Diaz'], 'pk': '111', 'score': 1.0, 'data': [b'Democrat', b'617-722-1673', b'Sonia.Chang-Diaz@masenate.gov'], 'rownum': 5}]

Incremental indexing
____________________

Besides the batch method shown above, records can be added to the
index incrementally.

.. doctest::

    >>> import polymr
    >>> be = polymr.storage.LevelDBBackend('data/ma_senators.polymr')
    >>> index = polymr.query.Index(be)
    >>> rec = polymr.record.Record(
    ...     ['Sarah', "Connor"],
    ...     '911',
    ...     [b'Resistance', b'617-575-1300', b'Sarah.Connor@masenate.gov']
    ... )
    >>> index.add([rec])
    [39]
    >>> index.search(['sarah', 'onno'])
    [{'fields': ['Sarah', 'Connor'], 'pk': '911', 'score': 0.4, 'data': [b'Resistance', b'617-575-1300', b'Sarah.Connor@masenate.gov'], 'rownum': 39}, {'fields': ['Patrick', "O'Connor"], 'pk': '520', 'score': 0.7857142857142857, 'data': [b'Republican', b'617-722-1646', b'Patrick.OConnor@masenate.gov'], 'rownum': 26}, {'fields': ['Kathleen', "O'Connor Ives"], 'pk': '215', 'score': 0.875, 'data': [b'Democrat', b'617-722-1604', b'Kathleen.OConnorIves@masenate.gov'], 'rownum': 27}, {'fields': ['Karen', 'Spilka'], 'pk': '212', 'score': 0.9285714285714286, 'data': [b'Democrat', b'617-722-1640', b'Karen.Spilka@masenate.gov'], 'rownum': 33}]


Using the command line interface
################################

Polymr ships with a command line interface for creating and searching
indexes. The ``polymr`` executable is installed along with the rest of
the polymr module during `install`_.
To see the invocation instructions, available options, and subcommands,
try ``polymr --help``.

Creating polymr indexes with ``polymr index``
_____________________________________________

The ``index`` subcommand creates polymr indexes. Creating an index with
``polymr index`` involves describing where you want the new index to be
created and feeding a delimited file of records into the executable.
Use ``polymr index --help`` to see invocation instructions and
available options.

Let's index some sample data. The source code repository contains the
contact information of the senators serving in the 190th General Court
of the Commonwealth of Massachusetts::

    $ cd data
    $ head -n3 ma_senators.csv
    Michael,Barrett,Democrat,416,617-722-1572,Mike.Barrett@masenate.gov
    Joseph,Boncore,Democrat,112,617-722-1634,Joseph.Boncore@masenate.gov
    Michael,Brady,Democrat,519,617-722-1200,Michael.Brady@masenate.gov

The ``ma_senators.csv`` file is a CSV containing the first name, last
name, party affiliation, room number, phone number, and email address
of all senate members. To index these entries, with the primary key set
to the senator's room number and the search fields set to the senator's
first name and last name, we can use::

    $ polymr index \
    >     -b leveldb://localhost/$PWD/ma_senators.polymr \
    >     --primary-key 3 \
    >     --search-idxs 0,1 \
    >     < ma_senators.csv

This creates a polymr index named ``ma_senators.polymr`` in the current
directory using the LevelDB backend.

Searching polymr indexes with ``polymr query``
______________________________________________

The ``query`` subcommand searches through a polymr index to find the
records most similar to a query. Queries are terms similar to the
search fields on the records you're looking for. A query should contain
the same number of elements as the index has search fields.
For example, if a set of records was indexed with two search fields,
queries should be composed of two elements, where the first element
searches through the first search field and the second element
searches through the second search field.

Let's search through the index of senators created in the previous
section, trying to find all senators with a last name resembling
'oconnor'::

    $ polymr query -b leveldb://localhost/$PWD/ma_senators.polymr '' 'oconnor'
    [
      {
        "fields": [
          "Patrick",
          "O'Connor"
        ],
        "pk": "520",
        "score": 0.7777777777777778,
        "data": [
          "Republican",
          "617-722-1646",
          "Patrick.OConnor@masenate.gov"
        ],
        "rownum": 26
      },
      {
        "fields": [
          "Kathleen",
          "O'Connor Ives"
        ],
        "pk": "215",
        "score": 0.8571428571428572,
        "data": [
          "Democrat",
          "617-722-1604",
          "Kathleen.OConnorIves@masenate.gov"
        ],
        "rownum": 27
      },
      {
        "fields": [
          "Sonia",
          "Chang-Diaz"
        ],
        "pk": "111",
        "score": 1.0,
        "data": [
          "Democrat",
          "617-722-1673",
          "Sonia.Chang-Diaz@masenate.gov"
        ],
        "rownum": 5
      }
    ]

We find that there are two senators with last names resembling
'oconnor': a Democrat and a Republican.

As always, consult ``polymr query --help`` for invocation instructions
and available options.
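
The JSON printed by ``polymr query`` can be post-processed with ordinary
tools. The following is a minimal sketch and not part of polymr. It
assumes the output shape shown in the query example, and it assumes,
based only on that sample, that a lower ``score`` indicates a closer
match. The ``closest_matches`` helper and the ``0.9`` cutoff are
hypothetical names and values chosen for illustration.

```python
import json

# Abbreviated copy of the query output from the example (the "data" key is omitted).
SAMPLE_OUTPUT = """
[
  {"fields": ["Patrick", "O'Connor"], "pk": "520", "score": 0.7777777777777778, "rownum": 26},
  {"fields": ["Kathleen", "O'Connor Ives"], "pk": "215", "score": 0.8571428571428572, "rownum": 27},
  {"fields": ["Sonia", "Chang-Diaz"], "pk": "111", "score": 1.0, "rownum": 5}
]
"""


def closest_matches(raw_json, max_score=0.9):
    """Return results whose score is at or below max_score, lowest score first.

    Assumption: a lower score means a closer match, as the sample suggests.
    The cutoff is a hypothetical value, not a polymr default.
    """
    results = json.loads(raw_json)
    kept = [r for r in results if r["score"] <= max_score]
    return sorted(kept, key=lambda r: r["score"])


matches = closest_matches(SAMPLE_OUTPUT)
for match in matches:
    print(match["pk"], " ".join(match["fields"]), match["score"])
```

In practice the string would come from the command's standard output
rather than a literal, and the cutoff would need tuning against real
queries.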