Discover Knowledge Functions

Knowledge Functions boost analytical productivity by scaling human expertise and automating complex workflows. They capture tacit domain knowledge, enabling machines to read, learn, and understand in specialized ways.

Underlying Accrete's Knowledge Functions are models that simulate memory across varying timescales. These models continuously learn by dynamically recalibrating neural architectures on the fly with minimal human feedback. The result is human-level accuracy with minimal training and tuning, and low maintenance costs.

Contextual Relevance

Assess how important a piece of data is in the context of a specific domain, just like a human expert. Using the next generation of natural language processing and the latest developments in deep learning, the machine is able to understand the meaning of a piece of information within a specific area of expertise, e.g. finance, real estate, or marketing. Articles can now be intelligently appraised, and a score can be set to determine how important and relevant any piece of information is. Categorize and distill domain-specific information with artificial intelligence, just like a human expert.

What Makes This Powerful
  • Agile learning - The model is built using a semi-supervised approach and is very agile in understanding and operating within different domains
  • Variable relevance - Users can configure the model with a numerical score and have the freedom to choose their own threshold to differentiate between relevant and irrelevant information
  • Adaptable scoring - The scoring can be used to extract both important and potentially important pieces of text
  • Continuous Dynamic Learning - The model intelligently learns and adapts to the texts each user finds relevant, adjusting dynamically within the knowledge function and providing ever-increasing value
How This Works

The power of building a model using a semi-supervised approach is the agility it gives the model to learn new domains quickly. In such cases the model learns only the specific relevant texts, while everything else is classified as irrelevant.

We combine a plethora of proprietary machine learning and deep-learning models to produce a score that signifies whether or not the content being evaluated is relevant.

The models work in tandem, in a nested fashion, to improve performance and pass context from one step to the next. Finally, as the model learns new topics and domains, the semi-supervised approach lets it develop branches, each specializing in one domain, so the model learns in a continuous and dynamic manner.
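The threshold-based relevance scoring described in this section can be illustrated with a minimal Python sketch. The keyword-overlap scorer below is a toy stand-in for Accrete's proprietary ensemble; the function names and example threshold are our own assumptions.

```python
def relevance_score(text: str, domain_keywords: set) -> float:
    """Toy stand-in for the relevance ensemble: the fraction of
    tokens that are domain keywords, as a score in [0, 1]."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in domain_keywords)
    return hits / len(tokens)


def classify(text: str, domain_keywords: set, threshold: float = 0.5) -> str:
    """Users choose their own threshold to separate relevant
    from irrelevant information."""
    score = relevance_score(text, domain_keywords)
    return "relevant" if score >= threshold else "irrelevant"
```

With a finance keyword set, "Fed raises interest rates" scores above a 0.5 threshold and is classified as relevant, while an off-domain sentence is not.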

Contextual Sentiment

Understand and contextualize the sentiment of a piece of text just like a human. Using an ensemble of hybrid deep-learning models, comprising transformer-based architectures overlaid with sequential and feed-forward neural networks, the contextual sentiment knowledge function can classify sentiment with 96% accuracy. Near-human precision is achieved using a new class of models that are superior to older methods reliant on positive/negative word counts. This knowledge function can understand and interpret the relevant context with respect to the domain of the text, and can contextually categorize text like a human.

Available in 2 languages: English, and Simplified Chinese. Can be translated into multiple languages using Accrete’s proprietary models.

What Makes This Powerful
  • Continuous stream - The function provides a probability measure for the sentiment, capturing how positive or negative the text is on a continuous scale rather than via a bag-of-words approach.
  • Quantifying magnitude - Humans, in general, are good at qualifying and classifying statements into different classes, but not necessarily at quantifying magnitude. Following this principle, the function is trained as a classifier and hence achieves much better performance than its competitors.
  • Continuous Learning - The hybrid ensemble approach is utilized to ensure the performance is maximized with respect to Bayes Human Error. As with every other function on our platform, this function also has the ability to continually learn from new data while mitigating catastrophic forgetting.
How This Works

The contextual sentiment function is built on the foundations of how humans perceive language and how we, as humans, classify things into different categories. We are good at categorizing, i.e. we would agree (on average) about whether something is positive, negative, or neutral, but if asked to provide a quantitative score we would generally not be on the same page. This is the underlying principle of how we have built this function: the ensemble of hybrid deep-learning models is first trained as a classifier for sentiment, rather than as a quantifier.

Then, the probabilistic measure of this ensemble is fine-tuned to further improve the value and performance of the function with respect to different use cases, e.g. trading in financial applications, a measure of side-effects in pharmaceuticals, etc. These models are built to ensure they achieve maximal performance with respect to the Bayes Human Error metric and provide scalability to problems that would be impossible for humans to comprehend.
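A toy numerical sketch of this classify-then-quantify design follows. The three-class setup and logit values are illustrative; the real ensemble is proprietary.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sentiment_measure(logits):
    """First classify (negative / neutral / positive), then derive a
    continuous measure in [-1, 1] from the class probabilities."""
    probs = softmax(logits)
    labels = ["negative", "neutral", "positive"]
    label = labels[probs.index(max(probs))]
    return label, probs[2] - probs[0]
```

The discrete label corresponds to the classification task the model is trained on; the probability difference supplies the continuous measure that is later fine-tuned per use case.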

Intelligent Web Discovery

Conduct in-depth research and stay up to date with a specific area of interest just like a human. Using the latest situationally aware machine-learning techniques, this knowledge function can track and discover relevant sources of information (articles, blogs, research, financial documents, tweets, etc.), all in real time. By configuring the tool to focus on a specialized area of expertise, it can continuously uncover relevant information and provide you with a domain-focused data feed as accurately as a human.

What Makes This Powerful
  • No Down-Time - Unlike a human researcher, a machine is always switched on. With 99.999% uptime the AI continuously scours for relevant information.
  • Autonomous discovery - Once seeded with the relevant search keywords, the intelligent web discovery knowledge function repeatedly discovers relevant sources of information.
  • Self-governing awareness - Once the knowledge function is seeded, the machine will intelligently distill what information is relevant. Over time the tool becomes more intelligent and delivers greater value.
  • Smart refinement - The knowledge function keeps refining the search based on its outputs and user-generated feedback. Over time the tool becomes smarter and the quality of the information surfaced improves significantly.
How This Works

The application starts by converting the keywords into a machine-processable query format. The model then uses this query to look for relevant information across the web, crawling direct and nested links based on the user-defined keywords. The model runs the query in a continuous loop, tuned to discover new sources and surface relevant information in real time.

All new, unique, relevant information is returned continuously by the function. The model also uses a built-in noise function which learns continuously to reduce non-relevant outputs. The query can be refined further by adding or modifying keywords based on the results returned by the function.
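The crawl loop described above can be sketched as follows. `fetch`, `extract_links`, and `is_relevant` are caller-supplied stand-ins (not named in the original text) for the crawler, link extractor, and learned noise filter.

```python
from collections import deque


def discover(seed_urls, fetch, extract_links, is_relevant, max_pages=100):
    """Minimal crawl loop: visit each page once, keep relevant URLs,
    and follow direct and nested links from every fetched page."""
    seen, results = set(), []
    frontier = deque(seed_urls)
    while frontier and len(seen) < max_pages:
        url = frontier.popleft()
        if url in seen:
            continue
        seen.add(url)
        page = fetch(url)
        if is_relevant(page):
            results.append(url)
        frontier.extend(extract_links(page))
    return results
```

In production such a loop would run continuously; the `max_pages` cap simply keeps the sketch finite.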

Source Reliability

Know how reliable an individual source of information is, without bias and better than a human. Across the internet there are millions of sources of information, some popular and many lesser-known. However, even if a source is not widely known, that does not mean it is any less reliable. By modeling numerous factors, including recency and propagation, Accrete has developed a dynamic algorithm for scoring the reliability of an individual source independent of its popularity. Recognizing that reliable sources publish highly accurate information with greater consistency, and far in advance of market leaders, opens a gateway to an untapped resource of ‘superstars’. Reliability scoring enables you to empirically understand how reliable a source of information is across time, decoupled from biases and better than a human.

What Makes This Powerful
  • Bias-Free Analysis - Factors like popularity skew the truth regarding the quality of information a source is sharing. With source reliability, these elements are eliminated which levels the playing field.
  • Dynamic Scoring - A source's reliability will vary through time. By building an algorithm that accounts for this variation the accuracy of the score is far superior.
  • Informed Decisions - By de-coupling bias and empirically measuring a source's reliability, end-users can make decisions based on the value of the content and get ahead of the competition.
How This Works

The reliability of a source can be explained by two factors:

  1. Recency: The elapsed time before a source publishes a specific piece of information provides an important metric of its reliability. If a “high-reliability” source takes much longer to make a statement than a “low-reliability” source, the informational-α has already been disseminated by the time it publishes, and the information is of no use.
  2. Propagation Count: The content’s informational reliability, which in turn reflects the source’s reliability, provides us with a second metric to dynamically change a source’s reliability scoring. In other words, if a piece of initial information is followed by many such pieces later, there is a high probability that this source had accurate information, as opposed to content that goes away quickly.

While Recency measures how fast some information is propagating, Propagation Count measures how many sources have published the same content. A combination of the above concepts gives a way to understand the source characteristics in a dynamic fashion that captures the informational-α with maximal probability.
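One hypothetical way to combine the two factors is sketched below. The exponential decay, the time constant `tau`, and the saturation constant `k` are our own assumptions, not Accrete's published formula.

```python
import math


def reliability_score(lead_time_hours, propagation_count, tau=24.0, k=5.0):
    """Earlier publication (small lead time after a story first breaks)
    and wider later propagation both raise the score; result is in (0, 1)."""
    recency = math.exp(-lead_time_hours / tau)                 # 1.0 for the first mover
    propagation = propagation_count / (propagation_count + k)  # saturates toward 1
    return recency * propagation
```

A source that publishes immediately and is later echoed by many others scores higher than one that publishes the same content two days later.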

Source Popularity

Understand the popularity of a source of information better than a human, without bias. This function uses a variety of parameters and a computational model that accounts for the average amount of traffic, individual bounce rate, and average time spent on the source. The popularity score is made more accurate by incorporating data specific to the source, including the relevancy of the source to the news, social popularity, and influence. By blending all of these data points, the individual popularity of a source can be quantified, giving you complete transparency into a source's popularity.

What Makes This Powerful
  • Continuous accurate assessment - The model considers a wide variety of factors that are dynamically adjusted as part of the feedback process which refines and increases the accuracy of the algorithm.
  • Regression modeling - Various regression models are assigned to the various factors which are continuously refined by the Accrete development team improving the scoring over time.
  • Secure and reliable - Accrete hosts our knowledge functions using global computing infrastructure that is reliable, scalable and secure. Service redundancy is guaranteed with 99.99% uptime.
How This Works

The model looks at different factors to identify source popularity based on the source type. A social media source's popularity is calculated from the number of followers, shares, likes, and comments, and the nested reach of the post. For other sources, popularity is based on the source type, the source's Alexa rank, number of visits, content type, and comments. The model continuously refines the weights assigned to these factors, improving the popularity score through a continuous learning model driven by user feedback.
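The weighted blend of factors can be sketched as below. The factor names, the assumption that each factor is normalized to [0, 1], and the weight values are all illustrative; in practice the weights are refined from user feedback.

```python
def popularity_score(features, weights):
    """Weighted average of per-source factors, each assumed to be
    normalized to [0, 1]."""
    total = sum(weights.values())
    return sum(weights[k] * features.get(k, 0.0) for k in weights) / total


# Hypothetical social media source with normalized factor values.
social = {"followers": 0.8, "shares": 0.6, "likes": 0.9, "comments": 0.4}
w = {"followers": 2.0, "shares": 1.5, "likes": 1.0, "comments": 0.5}
score = popularity_score(social, w)
```

Because the weights are a separate input, a feedback loop can adjust them without touching the scoring code.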

Topic Classification

To classify topics accurately, you need to understand language like a human. Using a deep-learning mathematical framework, each text snippet can be classified into a topic with 93% accuracy. This framework comprises the latest advancements in natural language processing and understands the contextual meaning of an individual sentence relative to its adjacent sentences and its placement within the entire paragraph or document. Accrete has developed AI that understands the semantic and contextual nature of an individual sentence within its surrounding environment and the domain of the text itself. Using language understanding, topic classification can provide a deeper level of understanding, as meticulously as a human, at scale.

Available in 2 languages: English and Simplified Chinese. Can be translated into multiple languages using Accrete’s proprietary models.

What Makes This Powerful
  • Language has multi-scaled representation - The tool is able to look at words that are both near and far away from each other in terms of their relationships massively increasing the topic classification accuracy.
  • Builds memory across the board - The model has been trained to extract context from sentences, near and far from within the document. Therefore accounting for changing context and nuance within the text.
  • Expandable topics - Users can generate their own list of topics and add them to the ones surfaced by the knowledge function, increasing the accuracy and value of the AI
  • Continuous dynamic learning - The model keeps growing in size and gets smarter as it encounters new data, ultimately delivering more value over time
  • Rapid contextual learning - With a very limited training dataset the AI is able to rapidly learn the context of a document. The model achieves 93% accuracy with respect to 'Bayes Error'.
How This Works

The model starts by converting the text into a multi-scale numerical representation. This representation allows the model to build a multi-scale memory from the text and thus extract the most relevant information for the task. These numerical representations are then passed to a deep-learning network module with a custom architecture.

We have been conducting extensive research to build a continuous dynamic learning network that can learn from user feedback without succumbing to catastrophic forgetting, as is usually the case with online learning for deep-learning networks. Our network has the ability to grow in size semi-autonomously, alleviating this issue, and to become smarter as it sees more data, just as a child grows into an educated adult.
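A toy sketch of the multi-scale idea: represent text as word n-grams at several window sizes and assign the topic with the greatest feature overlap. This is a deliberately simplified stand-in for the deep network; the function names and overlap heuristic are ours.

```python
def multiscale_features(text, scales=(1, 2, 3)):
    """Word n-grams at several window sizes, so relationships between
    both nearby and more distant words are captured."""
    words = text.lower().split()
    feats = set()
    for n in scales:
        for i in range(len(words) - n + 1):
            feats.add(" ".join(words[i:i + n]))
    return feats


def classify_topic(text, topic_prototypes):
    """Assign the topic whose prototype features overlap the query most."""
    feats = multiscale_features(text)
    return max(topic_prototypes, key=lambda t: len(feats & topic_prototypes[t]))
```

User-defined topics can be added simply by inserting new prototype entries, mirroring the "expandable topics" property above.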

Viral Influence

How influential a user and their content are is very difficult to quantify. Every social media platform (Twitter, YouTube, TikTok, etc.) employs a similar model whereby users are encouraged to post information and follow other users. This generates countless interactions between users through likes, comments, reposts, shares, and various other actions. To quantify the value of these interactions, Accrete has developed a model that counts interactions and calculates the popularity of interacting users, their followers, followers-of-followers, and so on. By looking beyond first-degree interactions, Accrete can model the sphere of influence and quantify the virality of individual users.

What Makes This Powerful
  • Real-time Assessment - Don't rely on raw quantitative values such as the number of likes, shares, or follower counts. Compute a user's viral influence in real time.
  • Regression modeling - Various regression models are assigned to the various factors and are continuously refined by the Accrete development team, improving the scoring over time.
  • Dynamic Scoring - A user's viral influence score is continuously recalculated as new interactions and activity take place. By building an algorithm that accounts for this variation, the accuracy of the score is superior.
  • Informed Decisions - De-couple bias and empirically measure a user's viral influence. Make decisions based on the true value of the user and increase your return on investment.
  • Zero Down-Time - Unlike a human researcher, a machine is always switched on. With 99.999% uptime the AI continuously scours for relevant information.
How This Works

The viral influence of a user is calculated from ‘impact’ factors derived from the qualitative engagement with their content. For example, the viral influence score for an artist on a music-centric platform calculates the impact of their tracks on immediate and nested user networks. To calculate the impact of user-generated content, the algorithm considers factors such as likes, reposts, shares, comments, and other pertinent actions. A network tree is constructed for each piece of content that captures users' activity, i.e. likes, downloads, etc.

Within a network tree, the root node represents the user and the child nodes are interacting users. Once this network tree is constructed, using Accrete’s Source Popularity knowledge function these interactions are scaled to calculate the impact. The viral influence score of the user is the average impact of all of the user-generated content.
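The tree traversal can be sketched as follows, assuming a per-content interaction tree (root = the content, children = interacting users, recursively) and per-user popularity scores. The data shapes and names here are our own illustration.

```python
def content_impact(tree, node, popularity):
    """Sum popularity-weighted interactions over the nested network:
    each child's popularity plus the impact of its own sub-network."""
    return sum(popularity.get(child, 0.0) + content_impact(tree, child, popularity)
               for child in tree.get(node, []))


def viral_influence(contents, trees, popularity):
    """Average impact across all of a user's content."""
    impacts = [content_impact(trees[c], c, popularity) for c in contents]
    return sum(impacts) / len(impacts) if impacts else 0.0
```

A nested interaction (e.g. a repost of a repost) contributes through the recursion, which is how influence beyond the first degree enters the score.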

Authentic Engagement

With millions of users sharing information online, it has become imperative for analysts to understand the authenticity of each user. With the rise of troll farms and the mass automation of bots, high levels of engagement can be manufactured. These disingenuous tools are used by organizations to spread misinformation and by individuals to unfairly garner fame. To account for these dynamics, Accrete's team has implemented a machine learning algorithm for Natural Language Processing that reads through comments in a semantic way. It repeats this process for each connected user up to 2 degrees away. By including the network of followers, it filters out bots and increases the accuracy of the authentic engagement score.

What Makes This Powerful
  • Real-time Assessment - Don't rely on raw quantitative values such as the number of likes, shares, or follower counts. Compute the true authenticity of any online user profile in real time.
  • Target Real Users - Filter out bots, trolls, and self-promotion. Engage with genuine users with a high-value online presence. Do this in a more nuanced way and increase your return on investment.
  • Dynamic Scoring - A source's authentic engagement value will vary depending upon recent activity through time. By building an algorithm that accounts for this variation the accuracy of the score is far superior.
  • Informed Decisions - De-couple bias and empirically measure a source's authenticity. Make decisions based on the true value of the user and get ahead of the competition.
How This Works

The machine uses a character-level embedding model that converts each comment into a tensor. Character-level models are helpful where the language may not have a specific linguistic structure, such as in social media comments, and they can account for the use of emojis.

The model is trained in an unsupervised manner on a subset of all the available data. The embeddings are then passed through an unsupervised clustering algorithm, which outputs 3 clusters: (1) Valid comments, (2) Self-promotional comments, and (3) Junk comments. The comments for each user are analyzed and an aggregated score is computed using Accrete's proprietary algorithm.
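A minimal sketch of the representation and the final aggregation step, assuming comments have already been assigned to the three clusters. The cluster weights are an illustrative assumption, not the proprietary algorithm.

```python
from collections import Counter


def char_embedding(comment):
    """Toy character-level representation (character counts), which also
    copes with emojis and unstructured social media text."""
    return Counter(comment)


def authenticity_score(cluster_labels):
    """Aggregate per-comment cluster assignments into one score:
    valid comments count fully, self-promotional ones partially,
    junk not at all. Labels: 'valid', 'self_promo', 'junk'."""
    if not cluster_labels:
        return 0.0
    weights = {"valid": 1.0, "self_promo": 0.2, "junk": 0.0}
    return sum(weights[label] for label in cluster_labels) / len(cluster_labels)
```

A user whose comment stream is dominated by junk or self-promotion thus scores low even if their raw engagement counts are high.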

Image Classification

The goal of image classification is to assign a specific label to an image. Here, we have developed an image classification model for the use case of grading collectible items.

Grading of collectible items such as baseball and other sports cards, trading cards, Pokemon cards, coins, currencies, stamps, and music and movie posters is currently done by human experts, who minutely examine an item for physical defects such as the incorrect centering of a baseball card, a tear in a card or poster, or rusting on a coin. This procedure can lead to inconsistency in the grade assigned to an item from one human expert to another. To reduce this inconsistency and automate the process, we have developed a deep learning-based computer vision algorithm that expedites grading and provides a consistent grading scheme across all items.

What Makes This Powerful
  • Real-time Assessment - The model is able to determine the grade of a collectible item in real-time
  • Long-tail distributed dataset - For most real-life datasets, the data often has a long-tail distribution, i.e. high-frequency classes contribute the majority of the dataset while low-frequency classes are largely under-represented. The current algorithm is able to accurately model datasets with such an underlying distribution
  • Continuous dynamic learning - The model keeps on improving by incorporating feedback from human experts
  • Extendable - The model can be easily extended to any image classification use-case
How This Works

The steps involved in getting the final grade of a collectible item:

Step 1: Extraction of the region of interest

The first step in the grading process is to extract the region of interest (ROI), i.e. the card region, from the image so that the grading model uses only the ROI to determine the grade of the card. Based on empirical studies conducted during this work, retraining the model requires no more than 400 labeled images.

Step 2: Grading the region of interest

Once the ROI is extracted, the next step is to grade the extracted image. We have developed a custom deep learning model which extracts hidden features from the raw ROI image and uses a classifier network to determine the final grade. Along with the hidden-features extracted by the deep learning model, we can also feed features or sub-grades defined by human experts such as centering, corner quality, surface smoothness and edge quality to the grading model to improve the accuracy of the model and provide human-level performance. We incorporate a continuous dynamic learning framework so that the model can improve its performance over time based on feedback provided by the user.
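The fusion of expert sub-grades can be sketched as a weighted combination. The equal weights are an assumption; the real model also consumes learned hidden features alongside these sub-grades.

```python
def final_grade(subgrades, weights=None):
    """Combine expert sub-grades (centering, corner quality, surface
    smoothness, edge quality) into one grade via a weighted average."""
    weights = weights or {name: 1.0 for name in subgrades}
    total = sum(weights.values())
    return sum(weights[name] * value for name, value in subgrades.items()) / total


# Hypothetical sub-grades for a single card.
card = {"centering": 9.0, "corners": 8.5, "surface": 9.5, "edges": 8.0}
grade = final_grade(card)
```

Because the combination is deterministic, every card with the same sub-grades receives the same final grade, which is the consistency property the section emphasizes.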

Entity Normalization

A named entity can have many in-text name-variants, which we normalize into one key-term (e.g. “United States”, “U.S.A”, “US”, “UNITED STATES”). When faced with a search result containing multiple name-variants, Accrete’s Entity Normalization Knowledge Function merges them into their key-term and regards them as one (e.g., their frequencies are added up to form the combined frequency for the key-term). When we refine a search result by choosing a key-term, we expand it to its name-variants and search for all of them. We keep normalized results as a dictionary with (key, values) = (key-term, a list of name-variants) for each entity type. We call each (key, value) pair an entry.

What Makes This Powerful
  • Efficient & Quick - Our solution is designed to work fast, be computationally efficient, and scale to data sets of any size. It also has built-in quality control algorithms to ensure the accuracy of the result.
  • Multi-Dimensional - Our solution automatically finds the optimal matchings among variants using various matching algorithms, and uses them all to increase the quality of normalization
  • Capable of handling data streams - Our solution can handle data streams as well as static data. We provide tools to maintain and update normalization databases to adapt to new inputs on a regular basis.
How This Works

Given N entity names, we want to find groups of varying lengths that refer to the same entities. The number of normalized entities and the group lengths both vary and are unknown. Even at the simplest level, standard approaches begin by comparing all N(N-1)/2 name pairs, with time complexity O(N²). At scale, this approach is untenable for conventional fuzzy matching processes.

We use a computationally efficient algorithm for comparing similarities of word-vector representations as the first layer, then run the various matching algorithms on the pairs found. This drastically speeds up the whole process and gives us multi-dimensional matching information, which we combine with an ensemble method into one comprehensive match score for each pairing. After some quality control, we use those pairs to generate clusters, which form our normalized name groups. Choosing a key-term as the representative for each name group completes the variant-level normalization process.

Similar processes can be applied at the group level and the database level to enhance and maintain the quality of normalization.
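The pairing-and-grouping pipeline can be sketched with a cheap string-similarity stand-in for the word-vector layer and the matching ensemble. The threshold and the greedy single-pass grouping are illustrative simplifications.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.6):
    """Cheap pairwise similarity check standing in for the word-vector
    first layer plus the ensemble of matching algorithms."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def normalize(names):
    """Greedy single-pass grouping: each name joins the first group
    whose key-term it matches, otherwise it starts a new group.
    Returns {key_term: [name_variants]}."""
    groups = {}
    for name in names:
        for key in groups:
            if similar(name, key):
                groups[key].append(name)
                break
        else:
            groups[name] = [name]
    return groups
```

The first name seen in each group serves as its key-term; the real system applies quality control and ensemble scoring before choosing representatives.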

Named Entity Recognition

Accrete’s Named Entity Recognition Knowledge Function is a supervised model that detects named entities in text, taking into account the surrounding context. This model detects the following entity types: Person, Organization, Location, Date, Time, Event, Law, Product, Facility, Percent, Quantity, Money, Geopolitical entity, and NORP.

What Makes This Powerful
  • Base models can extract up to 16 entity types - persons, organizations, locations, geopolitical entities, facilities, products, events, times, dates, money, national or religious groups, quantities, percentages, works of art, languages, laws.
  • Up-to-date - models are continuously fine-tuned to learn from new data distributions with almost no labeling required.
  • Dynamic - models are able to learn to detect new entity types by providing very few (up to 10) labeled examples.
How This Works

The task is addressed as a span detection problem; for each token in the text, the model predicts its entity type, or that it is not an entity at all. Typically, a neural network's parameters are tuned by minimizing a loss function that measures how far the predicted named entities are from the original ones. Accrete's model's parameters are tuned by minimizing a hybrid loss function that also takes into account how far the predicted part-of-speech (POS) tags are from the original ones. The assumption is that to better identify a named entity, it is important to incorporate syntactic information describing its role in the sentence, which is encoded in its POS tag.
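The hybrid objective can be sketched per token as follows. The mixing weight `lam` and the toy probability inputs are assumptions; the source does not specify how the two losses are weighted.

```python
import math


def cross_entropy(probs, target_idx):
    """Negative log-likelihood of the correct class."""
    return -math.log(probs[target_idx])


def hybrid_loss(ner_probs, ner_target, pos_probs, pos_target, lam=0.5):
    """Hybrid loss = NER loss + lam * POS loss, so syntactic role
    information also shapes the entity predictions during training."""
    return (cross_entropy(ner_probs, ner_target)
            + lam * cross_entropy(pos_probs, pos_target))
```

Minimizing this sum penalizes a model that gets the entity right but the POS tag badly wrong, nudging it toward representations that encode syntax.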

Entity Mapping

Given two separate domains, each with its own entity names, we may need to connect those names. The Entity Mapping algorithm finds a mapping from names in one domain to names in the other, based on their similarities across multiple dimensions. Currently, it is used in Argus for its Federated Search: entities in Semantic Search <--> entities in Relational Search, and entities in Investment Activity <--> entities in Relational Search.

What Makes This Powerful
  • Connecting Bridge between Domains - Each domain provides information about each entity along its own unique dimension, and it is often necessary to combine that information in one multidimensional place. Entity Mapping acts as a connecting bridge between different domains, allowing users to move between them and find different kinds of information with minimal effort.
  • Efficient & Quick - Our solution is designed to work fast, be computationally efficient, and scale to data sets of any size. It also has built-in quality control algorithms to ensure the accuracy of the result.
  • Capable of Handling Data Streams - Our solution can handle data streams as well as static data. We provide tools to maintain and update mapping information databases to adapt to new inputs on a regular basis.
How This Works

Given entities in two domains, we can find mappings among similar entities from each domain, utilizing the processes in Entity Normalization. It finds name pairs that are close to each other, where each name in the pair comes from a different domain. Using those pairs, we generate a graph structure with one node from one domain and its neighboring nodes from the other. Populating every node from the name pairs provides a mapping between the entities in the two domains. The end result can be one-to-many mappings when the domain sizes differ.
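The graph-population step can be sketched as follows, assuming the cross-domain name pairs have already been found; the pair data is illustrative.

```python
def build_mapping(pairs):
    """Turn cross-domain name pairs (a, b) into {domain-A name:
    [domain-B names]}; one-to-many mappings arise naturally when
    the domain sizes differ."""
    mapping = {}
    for a, b in pairs:
        mapping.setdefault(a, []).append(b)
    return mapping
```

Each key with more than one value is a one-to-many mapping, e.g. one company entity in Semantic Search linked to several name variants in Relational Search.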

Semantic Search

Semantic search enables analysts to search content using filters that reveal the context of the returned results. In addition to returning a list of articles for the search query, Semantic Search also provides insight related to the context. This includes the most significant entities that appear in the returned list, the categories associated with the results, a geo heat map showing the locations mentioned in the articles, a Knowledge Graph connecting the extracted entities to the returned results, and a Semantic Activation Map visualizing the context of the returned results.

What Makes This Powerful
  • Corpus Enrichment - The documents are segmented. The segments are used to identify different contexts mentioned inside the documents.
  • Mapping the Corpus into a Semantic Space - To enable processing, the articles are converted into a single machine-readable format. The visual features and the design of the articles are lost, but the content of each section is preserved. The content of each article is enriched with entities, keywords, concepts, and relations.
  • Indexing - The enriched document segments contain meta information about the original documents in addition to NLP enrichments such as entities, relations, semantic roles, concepts, categories, etc. All the enrichments are indexed by the search engine.
  • Dashboard and Visualization - The insights can be visualized via Knowledge Graphs, anomaly graphs, semantic maps, geo heat maps, and geo distribution maps to navigate and home in on the insight most relevant to the user.
How This Works

Pre-processing: Analysis of User Queries
The user queries are analyzed using Contextual Natural Language Processing. Keywords are extracted, normalized, and expanded in semantic context, and the associated Semantic Fingerprints are generated.

Post-processing: Extracting Insight, Building a Knowledge Graph
The returned results are processed to highlight relevant insights and create a knowledge graph. Knowledge Graphs, built using the relationships between entities, documents, and document segments, are a way to represent and extract hidden relations and insights.
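One simple way to realize the knowledge-graph step is to connect entities that co-occur in the same document segment, weighting edges by co-occurrence count. This is a sketch of the idea only, and the entity lists below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def knowledge_graph(segment_entities):
    """Build weighted edges between entities that co-occur in the same
    document segment; edge weight = number of co-occurrences."""
    edges = Counter()
    for entities in segment_entities:
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return edges
```

Sorting each pair gives a canonical edge key, so ("Fed", "rates") and ("rates", "Fed") accumulate into the same edge weight.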

Ready to Transcend Your Limits?

Get Started