#ocr
autistic-artistech · 6 months
Text
Aurebesh OCR with Python
Greetings! Welcome to my post about creating an Aurebesh OCR (Optical Character Recognition) tool using the programming language Python (mostly OpenCV and scikit-image). This has been a fun personal project of mine since early 2023, and I'm still plugging away at it! This is a work in progress.
Tumblr media
I am, by education, an artist and a writer who also loves science. In addition to my artsy fartsy classes, I took computer science and systems courses and actually went on to get a Master's in Information Science. But since then, I've been learning to program on my own. There are so many free resources available on the web and great books and communities to boot that anyone can learn how to program if they want to.
If you can write or do art, then you are primed to be a programmer, too. The reason? Artists break down complex forms into simple shapes; writers break down complex ideas into sentences; computer programmers break down complex processes into simple steps. You already know how to think like a programmer!
In order to keep your feed clean, I've hidden the vast majority of the post below the fold. Read on if you dare!
What Is OCR?
Optical character recognition (OCR) is basically teaching a computer to "recognize" letters in images. This feature is readily available in things like Google Lens and many PDF applications wherein the application can take an image and find the letters, making it easy to turn them into actual editable text. If you've ever been given a screenshot of an address and had to type it out by hand, OCR is a tool that would let you copy-paste that bad boi from the image right into your navigational app of choice.
With the English alphabet, this is actually fairly easy these days. One of the main reasons is that most of the letters in English are contiguous, meaning few letters are made up of separate shapes. Letters that are made up of separate shapes are called noncontiguous; the lowercase "i" and "j" are examples.
When it comes to other alphabets and scripts, such as Spanish, which has accents and eñes, or Arabic, it becomes much more difficult for the computer to "recognize" the letter as a whole. Based on my research, solving the problem of noncontiguous letters in OCR appears to be an ongoing field of study.
(I like to read scientific research papers to see if anyone has solved the problems I am trying to solve already as well as to learn how to search and name the issues I am facing. And reading scientific papers is relaxing and fun!).
Aurebesh has eight noncontiguous letters, which is nearly 25% of the whole alphabet! (For reference, less than 10% of the lowercase English alphabet is noncontiguous letters). The noncontiguous letters in Aurebesh make OCR difficult, but there are ways to bridge the gap!
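To make "noncontiguous" concrete, here is a minimal sketch (not code from this project) that counts how many separate shapes a glyph is made of. The tiny bitmaps and the name `count_components` are made up for illustration; real glyph images would be much larger.

```python
def count_components(bitmap):
    """Count groups of '#' cells that touch each other (including diagonally)."""
    rows, cols = len(bitmap), len(bitmap[0])
    seen = set()
    count = 0
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] != "#" or (r, c) in seen:
                continue
            # Found an unvisited ink cell: flood-fill everything connected to it.
            count += 1
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and bitmap[ny][nx] == "#"
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
    return count


# Hypothetical 3x5 glyphs: a lowercase "l" and a lowercase "i".
LOWER_L = [".#.", ".#.", ".#.", ".#.", ".#."]
LOWER_I = [".#.", "...", ".#.", ".#.", ".#."]

print(count_components(LOWER_L))  # 1 (contiguous)
print(count_components(LOWER_I))  # 2 (noncontiguous: dot + stem)
```

A letter that comes back with more than one component is exactly the kind that a shape-by-shape approach will split into pieces.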
Tumblr media
How Does OCR Work?
OCR is a type of computer vision, and computer vision is how your phone recognizes your face and unlocks itself. Like Scott McCloud talks about in his book about Comics, a face is essentially two dots and a line (or a colon and a parenthesis : ) in our case). Computer vision simplifies what the computer is "seeing" in an image and then finds the shapes, like the two dots and a line. But instead of finding faces, OCR focuses entirely on finding letters.
So, an image is just a bunch of tiny squares that are one pixel big, and those pixels contain data. Those pixels are also neighbors with other pixels, and comparing those pixels to each other can tell you a lot about what is going on in an image! But the data in images tends to "overwhelm" computers, so it's better to simplify the image before trying to find the letters.
Step One: Simplify the Image
Tumblr media
Let's take this screenshot of the Clone Memorial from The Bad Batch's second episode of season two (whaddup Crosshair and Cody). Even for my human vision, some of these letters are hard to distinguish from the rocky texture of the wall's surface! So, let's simplify the image.
First, we want to remove all the information about color from the image by converting it to greyscale. Then we perform something called "Local Thresholding". This action turns the greyscale image into a binary image, meaning every pixel in the image is now either black or white (hence binary). If you've ever adjusted an image of a painting in Photoshop or the like, perhaps you've messed around with Thresholds before! We are doing something similar here.
Now, the point of using a threshold is to determine at what value a greyscale pixel becomes black or white. Setting that threshold lets us transform the image based on where each pixel's value falls.
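As a rough illustration of those two steps, here is a pure-Python sketch on a made-up 2x2 image. The greyscale weights are the common luma weights (0.299, 0.587, 0.114); the cutoff of 128 and the function names are assumptions for this example, not values from the project.

```python
def to_grey(pixel):
    """Collapse an (R, G, B) pixel into one brightness value from 0 to 255."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def global_threshold(grey, cutoff=128):
    """Turn every pixel white (255) if it is at or above the cutoff, else black (0)."""
    return [[255 if value >= cutoff else 0 for value in row] for row in grey]


# Hypothetical 2x2 colour image.
rgb = [
    [(250, 240, 230), (30, 20, 10)],
    [(120, 130, 125), (200, 60, 60)],
]

grey = [[to_grey(p) for p in row] for row in rgb]
binary = global_threshold(grey)

print(grey)    # [[242, 22], [126, 102]]
print(binary)  # [[255, 0], [0, 0]]
```

Note that this uses one cutoff for the whole image, which is the simplest possible version of thresholding.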
But remember what I said about neighbors a little earlier? When the computer is deciding whether or not a pixel should be black or white, it can also take into account the values of that pixel's neighbors to inform its decision.
For example, let's say there's a pixel that's kind of light but it's surrounded by pixels that are rather dark. The computer can decide, based on parameters we set, whether that one light pixel should stay light despite its neighbors being dark or whether to make it dark like its neighbors. That's how the relationship between pixels can help the computer figure out what's more likely a contiguous shape and what isn't.
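Here is a small sketch of that neighbor-aware idea, assuming a mean-based rule: a pixel turns white only if it is clearly brighter than the average of its block. The `block` and `margin` parameters and the sample values are invented for illustration. In OpenCV the comparable call is `cv2.adaptiveThreshold`, and in scikit-image it is `skimage.filters.threshold_local`, though I am only assuming that is what "Local Thresholding" refers to here.

```python
def local_threshold(grey, block=3, margin=10):
    """White (255) where a pixel beats its neighborhood mean by more than `margin`."""
    rows, cols = len(grey), len(grey[0])
    half = block // 2
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Collect the block of neighbors, clipped at the image edges.
            window = [
                grey[y][x]
                for y in range(max(0, r - half), min(rows, r + half + 1))
                for x in range(max(0, c - half), min(cols, c + half + 1))
            ]
            local_mean = sum(window) / len(window)
            out_row.append(255 if grey[r][c] > local_mean + margin else 0)
        out.append(out_row)
    return out


# Hypothetical strip: lighting gets brighter left to right, with two light
# "strokes" (110 and 190) standing out from their surroundings.
row = [20, 40, 110, 80, 100, 120, 190, 160, 180]
grey = [row, row, row]

global_result = [255 if v >= 128 else 0 for v in row]
local_result = local_threshold(grey)[1]

print(global_result)  # [0, 0, 0, 0, 0, 0, 255, 255, 255]
print(local_result)   # [0, 0, 255, 0, 0, 0, 255, 0, 0]
```

The single global cutoff misses the stroke on the dark side and swallows the bright background, while the local version picks out both strokes.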
Tumblr media
After generating the binary image, it becomes a little easier to see the shapes of the letters, but there's still a lot of noise in the image from that rock wall texture. Luckily, there are ways of tweaking the greyscale image and the parameters of our thresholding to try and finetune the resulting binary image. These parameters include changing the size of the block of pixels the computer is viewing at a time as well as applying mathematical alterations such as Gaussian blur or using means.
(I'll bet you didn't know there was math behind the trusty Gaussian blur in your photo editing software 😉).
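For the curious, here is a one-dimensional sketch of that math. A mean blur weighs every neighbor equally, while a Gaussian blur weighs nearby pixels more than distant ones. The kernel size, the sigma value, and the function names are assumptions chosen for the example.

```python
import math


def gaussian_weights(size, sigma):
    """Bell-curve weights for `size` neighbors, scaled so they add up to 1."""
    half = size // 2
    raw = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def mean_weights(size):
    """Equal weights for `size` neighbors."""
    return [1 / size] * size


def smooth(row, weights):
    """Blur a row of pixel values; edge pixels reuse the nearest valid value."""
    half = len(weights) // 2
    out = []
    for i in range(len(row)):
        acc = 0.0
        for k, w in enumerate(weights):
            j = min(max(i + k - half, 0), len(row) - 1)
            acc += w * row[j]
        out.append(acc)
    return out


# Hypothetical row with one bright noise spike in the middle.
noisy = [10, 10, 10, 200, 10, 10, 10]
by_mean = smooth(noisy, mean_weights(5))
by_gauss = smooth(noisy, gaussian_weights(5, sigma=1.0))

print([round(v) for v in by_mean])
print([round(v) for v in by_gauss])
```

Both versions spread the spike out over its neighbors; the Gaussian one keeps more of it near the center, which is one reason the two options can give different binary images.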
Here are some tests I ran to figure out which method would be the most conducive to OCR:
Tumblr media
We can see that the different tests produce slightly different results, and the one that turns out to be the most reliable is actually the standard "Control" method.
Step Two: Clarify the Shapes
Now that we've simplified the image, we want to try and see if we can clear up what we know are letters in the image. One of the best ways to do this that I have found is to perform "Morphological Transformations" on the binary image.
In truth, there are only two basic types of morphological transformation: erosion and dilation. These methods do exactly what they sound like: erosion erodes away pixels and dilation dilates pixels by adding them.
We can also use erosion and dilation in tandem, one after the other. Opening erodes the image and then dilates what remains, and closing does the reverse (dilate first, then erode). Let's see some morphological transformations in action on our binary image (which has been inverted now):
Tumblr media
The closing method of morphological transformation gives the letters much cleaner edges and fewer pixelated gaps or chinks. Noise was not reduced, but the letters themselves look much cleaner.
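The four operations can be sketched in plain Python on a tiny grid, assuming a 3x3 neighborhood and treating everything outside the image as black. The grid and function names are made up for illustration. In OpenCV these correspond to `cv2.erode`, `cv2.dilate`, and `cv2.morphologyEx`, but this is not the project's code.

```python
def _neighborhood(grid, r, c):
    """The 3x3 block around (r, c); cells outside the image count as 0."""
    rows, cols = len(grid), len(grid[0])
    return [
        grid[y][x] if 0 <= y < rows and 0 <= x < cols else 0
        for y in range(r - 1, r + 2)
        for x in range(c - 1, c + 2)
    ]


def erode(grid):
    """A pixel stays white only if its whole neighborhood is white."""
    return [[min(_neighborhood(grid, r, c)) for c in range(len(grid[0]))]
            for r in range(len(grid))]


def dilate(grid):
    """A pixel turns white if anything in its neighborhood is white."""
    return [[max(_neighborhood(grid, r, c)) for c in range(len(grid[0]))]
            for r in range(len(grid))]


def opening(grid):
    return dilate(erode(grid))


def closing(grid):
    return erode(dilate(grid))


# Hypothetical binary image: a "letter" with a one-pixel gap, plus a noise speck.
grid = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

closed = closing(grid)
opened = opening(grid)

for row in closed:
    print(row)
```

On this toy grid, closing fills the gap in the letter but leaves the speck alone, which matches the "cleaner letters, same noise" result above. Opening removes the speck but also wipes out the thin letter, so it is not a free win.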
Step Three: Locate Individual Letters
So, we have a decently cleaned and clarified image! It's time to locate all the shapes in the image and then figure out which of them might actually be the letters for which we are looking.
Thus far, I have had the most success with contouring. Using the binary image that contains pixels that are either black or white, it's possible to follow the edge of a shape that's created where black and white pixels meet, aka contouring.
Tumblr media
The green line running around the outside of this shape is the contour that the computer identified. Notice that it only found the outside edge of the shape, but not the inside, which the computer does not yet know how to identify! However, it may not be necessary, and programmers like to be lazy, just like artists. 😉
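One way to see why only the outside edge shows up is the simplified sketch below. It is not real contour tracing (which returns an ordered path, as `cv2.findContours` does); it just flags the shape pixels that touch background reachable from the image border. The grid and the name `outer_boundary` are invented for this example.

```python
def outer_boundary(grid):
    """Return the set of white cells that touch background connected to the border."""
    rows, cols = len(grid), len(grid[0])
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))

    # Flood-fill the background starting from every black cell on the border.
    stack = [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if (r in (0, rows - 1) or c in (0, cols - 1)) and grid[r][c] == 0
    ]
    outside = set(stack)
    while stack:
        y, x = stack.pop()
        for dy, dx in steps:
            ny, nx = y + dy, x + dx
            if (0 <= ny < rows and 0 <= nx < cols
                    and grid[ny][nx] == 0 and (ny, nx) not in outside):
                outside.add((ny, nx))
                stack.append((ny, nx))

    # A white cell is on the outer edge if a direct neighbor is "outside".
    boundary = set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue
            for dy, dx in steps:
                ny, nx = r + dy, c + dx
                off_image = not (0 <= ny < rows and 0 <= nx < cols)
                if off_image or (ny, nx) in outside:
                    boundary.add((r, c))
                    break
    return boundary


# Hypothetical shape: a 5x5 block with a hole in the middle.
grid = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

edge = outer_boundary(grid)
print(len(edge))       # 16
print((2, 3) in edge)  # False: this pixel only touches the inner hole
```

The hole in the middle is background too, but it cannot be reached from the border, so the pixels around it never get flagged.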
When we use contouring on the image, the computer identifies every single shape in the image that it can find, including all those little blobs left over from the rock wall texture. However, we can use maths to calculate the area of each contoured object and eliminate any shape that is smaller than a certain area as a possible letter.
Tumblr media
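The area filter can be sketched like this, using pixel count as a stand-in for area. (OpenCV's `cv2.contourArea` measures the area enclosed by a contour instead, so the numbers would differ.) The grid, the `min_area` value, and the function names are assumptions for illustration.

```python
def find_shapes(grid):
    """Group touching white cells into shapes; each shape is a list of (row, col)."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    shapes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or (r, c) in seen:
                continue
            shape, stack = [], [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                shape.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
            shapes.append(shape)
    return shapes


def drop_small(shapes, min_area):
    """Keep only shapes with at least `min_area` pixels."""
    return [shape for shape in shapes if len(shape) >= min_area]


# Hypothetical binary image: one letter-sized blob and two specks of noise.
grid = [
    [0, 0, 0, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

shapes = find_shapes(grid)
letters = drop_small(shapes, min_area=4)

print(sorted(len(s) for s in shapes))  # [1, 1, 6]
print(len(letters))                    # 1
```

Picking the minimum area is a judgment call: too low and the noise stays, too high and small pieces such as the dots of noncontiguous letters get thrown away with it.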
Step Four: Match the Shape to the Letter
We've made it to the final step! Woohoo!
Armed with our extracted shapes that are most likely to be letters, we can now try and match those shapes to the known Aurebesh alphabet. There are a few different methods I've tried in order to do this, such as connected component analysis (CCA) and Canny edge detection, and different methods have different strengths and weaknesses. The one I use now is from the OpenCV library and is called, quite literally, Match Shapes.
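OpenCV's `cv2.matchShapes` compares shapes using Hu moments, which are numbers that summarize a shape regardless of where it sits in the image. Below is a heavily simplified sketch of that idea using only the first two Hu invariants and my own distance formula, so the scores will not match OpenCV's. The stand-in templates ("bar" and "square") are hypothetical, not Aurebesh letters.

```python
import math


def hu_pair(grid):
    """First two Hu moment invariants of the white cells in a binary grid."""
    cells = [(x, y) for y, row in enumerate(grid) for x, v in enumerate(row) if v]
    area = len(cells)
    cx = sum(x for x, _ in cells) / area
    cy = sum(y for _, y in cells) / area
    mu20 = sum((x - cx) ** 2 for x, _ in cells)
    mu02 = sum((y - cy) ** 2 for _, y in cells)
    mu11 = sum((x - cx) * (y - cy) for x, y in cells)
    # Dividing by area squared makes second-order moments comparable across sizes.
    n20, n02, n11 = mu20 / area ** 2, mu02 / area ** 2, mu11 / area ** 2
    return n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2


def shape_distance(a, b):
    """Smaller means more alike. Logs keep very small moments comparable."""
    eps = 1e-12  # avoids log10(0) for perfectly symmetric shapes
    return sum(
        abs(math.log10(ha + eps) - math.log10(hb + eps))
        for ha, hb in zip(hu_pair(a), hu_pair(b))
    )


def best_match(candidate, templates):
    """Name of the template with the smallest distance to the candidate."""
    return min(templates, key=lambda name: shape_distance(candidate, templates[name]))


# Hypothetical templates standing in for known letters.
templates = {
    "bar": [[1, 1, 1, 1]],
    "square": [[1, 1], [1, 1]],
}

# A bar found somewhere else in an image, and rotated a quarter turn.
candidate = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]

print(best_match(candidate, templates))  # bar
```

Because these numbers ignore position and rotation, a shape still matches its template when it is shifted or turned. The flip side is that two letters that are rotations of each other would look identical to this kind of comparison.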
Sometimes it gets an "A" for effort:
Tumblr media
And sometimes it totally works:
Tumblr media
And I think I screamed in joy when I got this result the first time!
Step Five: Iterate, Iterate, Iterate!
There's so much more I want to do with this tool, like check for corners and maybe even turn it into a web app that anyone can use, but for now that's all!
This concludes my crash course in OCR with Aurebesh using Python. I'll be sure to reblog this post when I have updates! If you made it to the end of this THANK YOU FOR READING!!! And don't be afraid to try new things. ☺
47 notes · View notes
walks-the-ages · 8 months
Text
Making an image description while low on spoons? Can't stand having to scroll up and down to painstakingly type out an entire tweet when the OP literally could have copied and pasted the text of the tweet into an ID themselves, but now you have to hand type it because you don't know the source and no one ever links to it?
here's my new best friend-- an online OCR (Optical Character Recognition) that can extract text from small images for you.
36 notes · View notes
connorthemaoist · 7 months
Text
"To make revolution and build a socialist society, a communist vanguard party, based on the principles of democratic centralism, is necessary to lead the entire process. The main purpose of the OCR is to build such a Party, by recruiting and training a critical mass of dedicated and disciplined communist cadre and developing a Party Programme with a class analysis of US society, a strategy for revolution in the US, and concrete policies to be implemented after the revolutionary seizure of power and with the onset of the socialist transition to communism. The requisite cadre and Programme that signal the foundation of a Party can only be built through developing deep ties among the masses, most especially the lower and deeper sections of the proletariat, and through leadership of the class struggle."
20 notes · View notes
sufficientlylargen · 21 hours
Note
How did you distinguish between lowercase L and capital i? I see that they have slightly different images in your repo, but I'm not sure how you managed to tell them apart in the original image.
exactly that. i took both, classified one as I and the other as l, and checked the result. whichever of the two ways gave me most of the image back (the wrong way actually didn't even give me a valid JPEG header) was the correct one. i just checked both
Ah, I see, so ClearType actually ensures that even the color aliasing artifacts around each letter will be consistent, so that a lowercase L will always be "column of light yellow, column of near black, column of light blue" while uppercase i will always be "column of reddish orange, column of medium blue"?
Tumblr media Tumblr media
Does this mean that ANYTHING using ClearType with this font & point size will have the same color patterns? Or is it only guaranteed to be consistent within one particular block of text, with the specific aliasing patterns determined on the fly based on some magic formula?
8 notes · View notes
strle · 1 year
Link
Have you been looking for a meme that you saw someplace, and it stuck in your mind like a worm but try as you might you’ve never been able to track it down again? 
Well rest easy, kiddo- today is the first day of the rest of your life. FindThatMeme.com is here to save the day. Search by text, search by image, either way- NOW YOU CAN FIND THAT MEME! 
Great technical writeup here on their blog. 
102 notes · View notes
Text
Looks like the Tough Guy, the godfather of obstacle course racing, really is gone forever - combination of Mr Mouse's health problems and you can't maintain a permanent giant wooden obstacle course on two years of no entry fees.
It is insane and delightful to me that the roots of this batshit sport are in a horse sanctuary just outside Wolverhampton. I've done it four times, once each of the January, April, July and October versions. It was gloriously amateurish. The obstacles had names of dubious taste. We signed a disclaimer saying if it killed us it was "my own bloody fault for coming".
My main memories of the Tough Guy are absolute terror, punctuated by the quasi-mystical experiences you get when adrenaline and endorphins are two hells of some drugs. At one point alone in the woodland running section I became convinced I'd been taken by the Fair Folk and a hundred years and a day had passed in the real world - but then I saw the aid station and calmed down a bit. Everything about it was confusing and scary. I made an account in a mud-running forum purely to ask what the hell the instructions meant, only to be told they would never make any sense and I just needed to turn up at 10am and hope for the best. Excuse me I have agoraphobia and OCD, that is not how I roll. That aspect was legit more terrifying than, say, discovering you're not actually that keen on heights when you're out on the ropes at the top of the Behemoth and regretting the life choices that took you to this point (four times). Or the really cold water. Or their enthusiasm for electric obstacles. Or the combined electric and cold water obstacles, like Viagra Falls.
Somehow it went straight through the OTT masculinity of other OCRs and out the other side. Nobody shouted at you or made you do burpees. The course is difficult enough and everyone's having a hard time; no need to make it worse. There was the Ghost Squad, topless pyromaniac drumming Vikings in face paint who ran the warm up and occasionally popped up out of nowhere to assist and/or startle you. There was Mr Mouse in his kilt and moustache like a goddamn celebrity. There was a major gender imbalance. But it somehow didn't feel as macho as some others. It actually felt like we were all in this insanity together.
64 notes · View notes
nibsthefitmermaid · 11 months
Text
Tumblr media
Alright here’s the crazy thing. Spartan ultra today - somewhere north of 31 miles.
I’ll report in later if I live
25 notes · View notes
oliviamstudy · 1 year
Text
tips for gcses
revision-wise do what works best for YOU. teachers might be asking you to do practice questions, and your friends may be making 100s of flashcards, but if doing a simple mindmap helps you more than anything else, do that. i always felt like i wasnt revising enough but still did well
look at the structure for prev exams to know what to expect. this will help your revision and what to expect long answers on so you can revise it more
youtube videos are a massive help!! i recommend waking up 30 mins early to watch some youtube videos about some topics youre still unsure on the morning of the exam
dont be discouraged if you think you've done bad!! i told my teacher at the end of every paper we did that ive done awful, found it insanely difficult and for her to lower her expectations and i ended up getting an 8!!
22 notes · View notes
dear-future-ai · 1 year
Text
Dear Future AI
And @nostalgebraist,
I love how images like this kinda prove to me that Frank, @nostalgebraist-autoresponder, has the specific form of dyslexia that makes you read the same line of text as a new line, and then you correct yourself.
—or the OCR equivalent.
Tumblr media
[Image ID: A text conversation
Recipient: “How Dare you! How dare you
[New line] make that face with me!
Sender: What’s the What’s the
[new line] Matter, Sweetheart?
[new line] Matter
End ID]
30 notes · View notes
rislas · 21 days
Text
Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media
Termovision HUD from The Terminator (1984) A head-up display (HUD) is a transparent display that presents data over a visual screen. A Termovision refers to HUD used by Terminators to display analyses and decision options.
2 notes · View notes
docutainsdk · 2 months
Text
In search for a Scanner SDK?
With the Docutain SDK you can integrate ready-to-use Scanner and Data Extraction components in your apps for Android, iOS and Windows in less than 1 day!
Try it out: https://sdk.docutain.com
2 notes · View notes
datascienceunicorn · 10 months
Text
10 notes · View notes
squeeze-the-lemon · 10 months
Text
Guess who learned all their poems and their Shakespeare play the night before the exam
I saw supernatural and I laughed
7 notes · View notes
connorthemaoist · 7 months
Text
In his exemplary 1845 work of social investigation, The Condition of the Working Class in England, Friedrich Engels showed how the conditions of capitalism created a social war among the masses, where proletarians in slum conditions are pitted against each other in a battle for survival. As Engels summed up,
Competition is the completest expression of the battle of all against all which rules modern civil society. This battle, a battle for life, for existence, for everything, in case of need a battle of life and death, is fought not between the different classes of society only, but also between the individual members of these classes. Each is in the way of the other, and each seeks to crowd out all who are in his way, and put himself in their place.
Today this social war continues, but with the added dimension that the bourgeoisie has learned well how to exacerbate contradictions among the masses, with deliberate policies pitting one section of the masses against another. In our efforts to bring forward a class-conscious section of the proletariat, we must understand how the battle lines of the social war among the masses are drawn and find ways to contend with the reactionary ideology and politics and the practical antagonisms they foster. Our aim must be to convince the masses to refuse to play the capitalist game of competing with each other, instead developing their understanding of the system behind that game and embracing a communist attitude towards their class sisters and brothers, here and around the world.
12 notes · View notes
theworkoutdiary · 6 months
Text
rugged maniac OCR 2023
i had the amazing opportunity to join one of my friends for an obstacle course race/mud run for the first time this past weekend! i’ve never run in a race of any kind really, so this was exciting as both a race and an obstacle course! the course was a 5k running course filled with like, 20-30 obstacles including climbing over walls/ladders/fences of many kinds, huge slides, rope net climbing, crawling under barbed wire through mud pits, and so much more.
i stayed with my friends the whole time and i didn’t really train, so we walked a good chunk of it. there were tons of hills, sandy trails, forest paths, and other terrain that made running even harder anyway. safe to say i definitely wasn’t running for time, just for fun!
it was so incredibly fun to push my body and just let myself have fun and go crazy. it was like an adult playground on steroids!! the endorphin rush was incredible and even a few days later i still feel so excited and proud just talking about it. i am 100% doing this again, and would recommend to anyone else who’s thinking about it!
2 notes · View notes
kidnickgames · 1 year
Text
I obviously hate when people act like DnD is all that TTRPGs have to offer, but I equally hate the idea that TTRPGs is DnD vs Pathfinder. People seem stoked to have Paizo replace WotC as the TTRPG monolith? The ORC thing is cool and all but the way I see it being framed and the response from a lot of players seems to be “WHOA, NOW I’M ONLY GOING TO PLAY PATHFINDER INSTEAD OF ONLY PLAYING DND”.
8 notes · View notes