This product generates a flexible package of floating licenses for the John Snow Labs Natural Language Processing Python libraries. At subscription time, users specify the duration of the subscription and the number of licenses they need for each library: Healthcare NLP, Visual NLP, Finance NLP, and Legal NLP. When the product is deployed on an EC2 machine, a license bundle is automatically generated and loaded into the Jupyter Lab environment. Users can immediately start annotating their text documents with pretrained DL models, rules, and prompts, or train and tune new models. To facilitate quick experiments, the product includes a set of ready-to-use notebooks, available in the Jupyter Lab environment.
The license package can be used on the AWS servers where this product is deployed, or it can be downloaded and used elsewhere (e.g. on local machines or in Colab environments). The license bundle can be shared by a team of data scientists or used inside the NLP Lab product to unlock Visual, Healthcare, Finance, or Legal features, according to the bundle configuration.
There is no limit on the number of documents, models, or pipelines that can be used with this subscription. However, the number of parallel pipelines/jobs that can be run is limited to the number of floating licenses selected at subscription time.
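The effect of this limit can be pictured with a small Python sketch. It is only an illustration of the floating-license idea, not the mechanism the product actually uses to enforce it: the function name `peak_concurrency` and the sleep that stands in for a pipeline run are hypothetical. Any number of jobs can be submitted, but no more than `n_licenses` of them run at the same time.

```python
import threading
import time


def peak_concurrency(n_jobs, n_licenses, work_seconds=0.05):
    """Run n_jobs threads that each need one license slot.

    Returns the highest number of jobs that were running at once.
    Illustrative only: a semaphore stands in for the license pool.
    """
    slots = threading.BoundedSemaphore(n_licenses)  # one slot per floating license
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def job():
        with slots:  # blocks until a license slot is free
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(work_seconds)  # stand-in for running a pipeline
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=job) for _ in range(n_jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    # Six jobs are submitted, but at most two run in parallel.
    print(peak_concurrency(n_jobs=6, n_licenses=2))
```

All six jobs complete; the extra jobs simply wait until a slot is released.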
What is included
Spark NLP, the most widely used NLP library in the Enterprise. It provides production-grade, scalable, and trainable versions of the latest research in natural language processing, and access to 700+ embeddings and 15,000+ pretrained models and pipelines covering tasks such as Entity Recognition, Information Extraction, Spelling and Grammar, Text Classification, Translation, Summarization, Question Answering, and Emotion Detection.
Healthcare NLP software and models, enabling clinical and biomedical Named Entity Recognition for 400+ entity types, assertion status detection (distinguishing among positive, negative, possible, past, and future facts), clinical relation extraction, and clinical entity resolution to SNOMED-CT, ICD-10, CPT, RxNorm, LOINC, NDC, ICD-O, MeSH, and UMLS.
Finance NLP software and models, enabling financial text classification, financial sentiment analysis, financial Named Entity Recognition (e.g. organizations, products, revenue, profit, losses, trading symbols, etc.), Entity Linking for normalizing NER entities and linking them to databases such as Edgar, Crunchbase, and Nasdaq, Assertion Status for inferring temporality, Relation Extraction, financial De-identification, and more.
Legal NLP software and models, covering Named Entity Recognition, Information Extraction on clauses, Legal Clause Classification, Legal Relationship Extraction, Entity Linking, Legal De-identification, Assertion Status, and Relation Extraction. It includes access to 300+ new state-of-the-art models available in multiple languages.
Visual NLP (OCR) software and models, enabling form understanding, table detection and extraction, noisy image enhancement, visual document classification, visual entity recognition, DICOM to text, signature detection, and image de-identification.
Access to models and pipelines published on the NLP Models Hub (currently 17,900+ and counting).
30+ Ready-to-use Jupyter notebooks that will help you get started with text and image analysis on all major NLP tasks.
Who is this offer for
Teams of Python developers that need to extract entities and relations from text, image, and PDF documents;
Data scientists who deal with NLP problems;
Machine learning engineers who need to test/train/tune NLP models;
Scientific research groups who need to extract meaning from unstructured, natural language documents;
And anyone else interested in text and image analysis, image digitization, data extraction, document labeling and/or NLP model training.
Technical Specifications
Recommended memory: 32GB RAM
Recommended vCPU: 8 vCPUs
Operating System: Ubuntu 20.04
Included integrations
Jupyter Lab is preinstalled and running on port 5000. Password: INSTANCE_ID
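As a rough sketch of how one might confirm that Jupyter Lab is reachable after deployment, the Python snippet below builds the address from the instance host and the port stated above, then tries a plain TCP connection. The helper names `jupyter_url` and `port_is_open`, the example host, and the use of `http` rather than `https` are assumptions for illustration; the instance's security group must also allow inbound traffic on that port.

```python
import socket

JUPYTER_PORT = 5000  # port stated in this listing


def jupyter_url(host, port=JUPYTER_PORT):
    """Build the Jupyter Lab address (the http scheme is an assumption)."""
    return f"http://{host}:{port}"


def port_is_open(host, port=JUPYTER_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


if __name__ == "__main__":
    host = "203.0.113.10"  # hypothetical address; replace with your instance's
    print(jupyter_url(host))
    print("reachable" if port_is_open(host) else "not reachable")
```

If the port is not reachable, the usual things to check are that the instance has finished starting and that its network rules permit access to port 5000.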