GitHub Repository: GRAAL-Research/deepparse
Path: blob/main/examples/parse_addresses_uri.py
# pylint: skip-file
###################
"""
IMPORTANT:
THE EXAMPLE IN THIS FILE IS CURRENTLY NOT FUNCTIONAL
BECAUSE THE `download_from_public_repository` FUNCTION
NO LONGER EXISTS. WE HAD TO MAKE A QUICK RELEASE TO
REMEDIATE AN ISSUE IN OUR PREVIOUS STORAGE SOLUTION.
THIS WILL BE FIXED IN A FUTURE RELEASE.

IN THE MEANTIME, IF YOU NEED ANY CLARIFICATION
REGARDING THE PACKAGE, PLEASE FEEL FREE TO OPEN AN ISSUE.
"""
from deepparse import download_from_public_repository
from deepparse.dataset_container import PickleDatasetContainer
from deepparse.parser import AddressParser

# Here is an example of how to parse multiple addresses using a retrained model stored in an S3 bucket
# and referenced by its URI.
# First, let's download the dataset of addresses to parse from the public repository.
saving_dir = "./data"
file_extension = "p"
test_dataset_name = "predict"
download_from_public_repository(test_dataset_name, saving_dir, file_extension=file_extension)

# Now let's load the dataset using one of our dataset containers
addresses_to_parse = PickleDatasetContainer("./data/predict.p", is_training_container=False)

# We can take a sneak peek at some addresses
print(addresses_to_parse[:2])

# Let's use the FastText model on a GPU
path_to_your_uri = "s3://<path_to_your_bucket>/fasttext.ckpt"
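# Optional: a minimal sketch of a sanity check on the URI before it is handed to AddressParser.
# `split_s3_uri` is a hypothetical helper and NOT part of deepparse. It assumes the usual
# s3://<bucket>/<key> layout and only inspects the string (no network access, no credentials check).
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Return (bucket, key) for an s3:// URI, or raise ValueError if it is malformed."""
    parsed = urlparse(uri)
    bucket, key = parsed.netloc, parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not bucket or not key:
        raise ValueError(f"Expected a URI of the form s3://<bucket>/<key>, got: {uri!r}")
    return bucket, key


# Hypothetical usage with the variable defined above:
# bucket_name, checkpoint_key = split_s3_uri(path_to_your_uri)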
address_parser = AddressParser(model_type="fasttext", device=0, path_to_retrained_model=path_to_your_uri)

# We can now parse some addresses
parsed_addresses = address_parser(addresses_to_parse[0:300])