A diacritization model for the Arabic language, trained on Tashkeela, the Arabic diacritization corpus available on Kaggle.
Updated Sep 10, 2023 - Python
Official code for Group-Transformer (Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model, COLING-2020).
A structured documentation hub for AI and ML concepts, based on Andrej Karpathy's 'Zero to Hero' series, featuring practical implementations and learning resources for language models and transformers.
Lyrics generation using LSTM, word2vec analysis, and more
Text article generator using a character-level LSTM network.
Build a character-level language model to generate new dinosaur names
Sequence Models coding assignments
A causal intervention framework to learn robust and interpretable character representations inside subword-based language models
In this project, I worked with a small corpus of simple sentences. I tokenized the words using n-grams from the NLTK library and performed word-level and character-level one-hot encoding. I also used the Keras Tokenizer to tokenize the sentences and implemented word embeddings using the Embedding layer. For sentiment analysis
An implementation of "Character-level Convolutional Networks for Text Classification" in TensorFlow. See https://arxiv.org/pdf/1509.01626.pdf.
Notebooks for the programming assignments of the deeplearning.ai Sequence Models course on Coursera (May 2020)
Generates new sentences with an RNN that learns from text at the character level. A collection of Shakespeare's works was used as training data.
This repository contains the code and PLODv2 dataset to train character-level language models (CLM) for abbreviation and long-form detection released with our LREC-COLING 2024 publication
Name generation using an RNN. The model was trained to generate Indian names. Built with Keras.
Retro-style tokenization for language models
Character-level and token-based language models implemented in pure PyTorch.
This repository contains the source code for our research on character-level models in Arabic Natural Language Processing (NLP).
Recurrent neural network for building a character-level language model and its application to generating new dinosaur names