Tagging texts by topic is manual, tedious, and recurring work. At Graphext, we speed up this process so you can focus your efforts on your analysis.
With the Text Category prediction flow, you can automate this tagging process using previously categorized texts. This flow is especially helpful for categorizing data from Tractor.
How do you get the predictions?
- First, you have to obtain your data from Tractor. You can find help here.
- Upload the resulting file to Graphext, where you can run many kinds of analysis. Here we are interested in the Content Analysis flow, which you can find as a recipe or through the wizard in the tweets or text sections. Once you have the project, use the filters and the graph to create your categories as a segmentation.
- Once you have defined your categories, go to the details section and export the full dataset as a CSV file.
- Download a new dataset from Tractor. Then, put it together with the dataset exported in the third step in a single zip file and upload that zip file to Graphext.
- Using the wizard, select Text as the kind of data to work with, and Predict the category or topic of texts as the analysis. You will then be asked a few questions. Indicate that your data is from Graphext, and select the column that contains the text and the column that contains the categories.
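If you prepare these uploads regularly, the zip file from the fourth step can be built with a short script instead of by hand. The following is a minimal sketch using Python's standard `zipfile` module. The function name `bundle_datasets` and the file names in the usage note are hypothetical, and the sketch assumes both datasets are already saved locally as CSV files.

```python
import zipfile
from pathlib import Path


def bundle_datasets(labeled_csv, new_csv, zip_path):
    """Write both CSV files into one zip archive and return the archive path."""
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for csv_file in (Path(labeled_csv), Path(new_csv)):
            # arcname keeps only the file name, so the archive has no nested folders.
            archive.write(csv_file, arcname=csv_file.name)
    return zip_path
```

For example, `bundle_datasets("categorized_export.csv", "new_tractor_data.csv", "prediction_input.zip")` would produce a `prediction_input.zip` that contains both files at the top level, ready to upload.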
The resulting project contains a column named GX PREDICTED CLASS, which holds the predicted tag for every text. The probability that each prediction is correct is available in the GX PREDICTED CLASS PROBABILITY variable.
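One way to use the probability variable is to separate confident predictions from those that deserve a manual check. The sketch below is only an illustration. It assumes you have exported the resulting project as a CSV file whose headers match the two variable names exactly, and that the probability is stored as a number between 0 and 1. The function name `split_by_confidence` and the 0.8 threshold are arbitrary choices, not Graphext defaults.

```python
import csv

CLASS_COLUMN = "GX PREDICTED CLASS"
PROBABILITY_COLUMN = "GX PREDICTED CLASS PROBABILITY"


def split_by_confidence(csv_path, threshold=0.8):
    """Return (confident_rows, uncertain_rows), each a list of row dicts."""
    confident, uncertain = [], []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # Assumption: the probability column parses as a float in [0, 1].
            probability = float(row[PROBABILITY_COLUMN])
            if probability >= threshold:
                confident.append(row)
            else:
                uncertain.append(row)
    return confident, uncertain
```

The rows in the second list are the ones worth reviewing by hand. Each row still carries its predicted tag under `CLASS_COLUMN`, so corrected tags can be fed back in as categorized texts the next time you run the flow.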