
Image processing

Working with photos and videos


InvokeAI 2.2: UI Outpainting, Embedding Management and more

Reading time 2 min
Views 6K

InvokeAI 2.2 is now available to everyone. This update brings in exciting features, like UI Outpainting, Embedding Management and more. See highlighted updates below, or the full release notes for everything included in the release.

What’s new?
Total votes 5: ↑5 and ↓0 +5
Comments 2

I trained a neural network on my drawings and am giving the model away for free (and will teach you to create your own)

Reading time 2 min
Views 3.2K

Great for seamless patterns, abstract drawings, and watercolor-styled images. How do you use it and train a neural network on your own pictures?

Download the model here: https://huggingface.co/netsvetaev/netsvetaev-free

I wanna know!
Total votes 6: ↑6 and ↓0 +6
Comments 0

Color image capturing device with pseudorandom pattern sets

Reading time 4 min
Views 686

The present invention relates to analog signal capturing devices in general, and in particular to monochrome or color image capture sensors, such as a scanner or a charge-coupled device (“CCD”) for video and photo cameras, that are almost free from moiré and aliasing. The present invention also relates to methods for enhancing the resolution of an image capture device and to a device for displaying digital color or grayscale images.

Read more
Total votes 1: ↑1 and ↓0 +1
Comments 1

ruDALL-E: Generating Images from Text. Facing down the biggest computational challenge in Russia

Reading time 11 min
Views 10K

Multimodality has led the pack in machine learning in 2021. Neural networks are wolfing down images, text, speech and music all at the same time. OpenAI is, as usual, top dog, but as if in defiance of their name, they are in no hurry to share their models openly. At the beginning of the year, the company presented the DALL-E neural network, which generates 256x256 pixel images in response to a text prompt. Descriptions of it can be found in articles on arXiv, and examples on their blog.

As soon as DALL-E was flushed out of the bushes, Chinese researchers got on its tail. Their open-source CogView neural network does the same trick of generating images from text. But what about here in Russia? One might say that “investigate, master, and train” is our engineering motto. Well, we caught the scent, and today we can say that we have created from scratch a complete pipeline for generating images from descriptive text written in Russian.

In this article we present the ruDALL-E XL model, an open-source text-to-image transformer with 1.3 billion parameters, as well as the ruDALL-E XXL model, a text-to-image transformer with 12.0 billion parameters that is available in DataHub SberCloud, and several other satellite models.

Read more
Total votes 3: ↑3 and ↓0 +3
Comments 4

Mode on: Comparing the two best colorization AIs

Reading time 11 min
Views 3.2K

This article continues a series of notes about colorization. During today's experiment, we’ll be comparing a recent neural network with the good old Deoldify to gauge the rate at which the future is approaching.

This is a practical project, so we won’t pay extra attention to the underlying philosophy of the Transformer architecture. Besides, any attempt to explain its operating principles to a wide audience in hand-waving terms would be misleading.

A lecturer: Mr. Petrov! How does a transformer work?
Petrov with a bass voice: Hum-m-m-m.


Google Colorizing Transformer vs Deoldify

Read more →
Total votes 17: ↑17 and ↓0 +17
Comments 0

Playing with Nvidia's New Ampere GPUs and Trying MIG

Reading time 11 min
Views 3.8K


Every time the essential question arises of whether to upgrade the cards in the server room, I look through similar articles and watch such videos.


The channel with the aforementioned video is very underrated, but its author does not deal with ML. In general, when analyzing comparisons of accelerators for ML, several things usually catch your eye:


  • The authors usually consider only how "adequate" new cards are for the market in the United States;
  • The ratings are far removed from everyday users and are run on very standard networks (which is probably good overall) without details;
  • The popular mantra of training ever more gigantic models skews the comparison.

The answer to the question "which card is better?" is not rocket science: cards of the 20* series never became very popular, while the 1080 Ti from Avito (the Russian Craigslist) is still very attractive (and, oddly enough, is not getting cheaper, probably for this very reason).


All this is fine and dandy, and the standard benchmarks are unlikely to lie too much, but I recently learned about Multi-Instance GPU (MIG) technology for A100 cards and native TF32 support on Ampere devices, and I got the idea to share my experience of actually testing cards on the Ampere architecture (3090 and A100). In this short note, I will try to answer the following questions:


  • Is the upgrade to Ampere worth it? (spoiler for the impatient: yes);
  • Is the A100 worth the money? (spoiler: in general, no);
  • Are there any cases where the A100 is still interesting? (spoiler: yes);
  • Is MIG technology useful? (spoiler: yes, but for inference and for very specific training cases).
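
To make the TF32 mention above more concrete, here is a minimal sketch that emulates TF32 precision in plain Python. TF32 keeps float32's 8 exponent bits but only 10 of its 23 mantissa bits. The function names `to_tf32` and `dot` are hypothetical, and the sketch truncates the extra bits, whereas real hardware may round differently. It is an illustration of the number format, not a benchmark from the article.

```python
import struct


def to_tf32(x: float) -> float:
    """Emulate TF32 precision by keeping 10 of float32's 23 mantissa bits."""
    # Reinterpret the value as the bit pattern of a 32-bit IEEE 754 float.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Zero the 13 lowest mantissa bits (truncation; an assumption of this sketch).
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot(a, b, cast=float):
    """Dot product whose inputs pass through `cast`; the sum is kept in full precision."""
    return sum(cast(x) * cast(y) for x, y in zip(a, b))


if __name__ == "__main__":
    a = [1.0001, 2.0002, 3.0003]
    b = [0.9999, 1.9998, 2.9997]
    print("float inputs:", dot(a, b))
    print("TF32 inputs: ", dot(a, b, cast=to_tf32))
```

On real Ampere hardware with PyTorch, TF32 is switched on or off through the `torch.backends.cuda.matmul.allow_tf32` and `torch.backends.cudnn.allow_tf32` flags rather than emulated like this.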
Read more →
Total votes 5: ↑5 and ↓0 +5
Comments 0

How to build your own ID photo applet with HUAWEI ML Kit

Reading time 5 min
Views 2K

Overview


In the previous article, we explained how to build a smile-capturing camera with HUAWEI ML Kit. This time I am going to introduce another HUAWEI ML Kit feature.

Have you ever been asked at school or at work to bring a photo of a specific size with a colored background for documents? In most cases, people do not have a suitable photo at hand. Once, at my institute, we were to be issued personal passes, but the photo studio turned out to be closed. So I took a picture of myself with my phone, using a bed sheet as the background, and got a reprimand from my teacher. With HUAWEI ML Kit, however, you can integrate the image segmentation SDK into your app and build an applet that lets you make ID photos yourself, solving the problem of not having the right photo.

Best of all, this SDK is completely free and works on all Android phones.

Developing the ID photo applet yourself


1. Preparation


1.1 Add the Huawei Maven repository to the project-level build.gradle file


Open the build.gradle file in the root directory of your Android Studio project.


Read more →
Total votes 2: ↑2 and ↓0 +2
Comments 0

Neural networks in reality

Reading time 2 min
Views 858
The mass of news and articles about artificial intelligence creates the illusion that we are living in a fantastic time. But when you start asking people which of these high technologies is actually useful in real life, the answers come down to a few Google features, mobile games, and a story about Chinese videos. Incidentally, for some reason the central mass media constantly show these Chinese videos when demonstrating Moscow's intelligent technologies.

In words, it seems that all these «intellects» are already installed everywhere and the whole country switched to neural networks long ago, but only in demonstration pictures, in diagrams, in hand-waving explanations. This creates cognitive dissonance: why not take a video camera and shoot at least a fragment of how Russia's super mega technologies work?

As Nikita Sergeevich said, «science ceases to be self-indulgence when its fruits are applied in the national economy.» Today's artificial intelligence is familiar to us mostly from games, and many people really want to see something useful in reality. So we took the trouble to record our own video of neural networks operating on real objects.

Total votes 1: ↑0 and ↓1 -1
Comments 0

The color of the Moon and the Sun from space in terms of RGB and color temperature

Reading time 17 min
Views 3.2K
It would seem that the question of the color of the Moon and the Sun as seen from space is so simple for modern science that in our century there should be no problem at all finding the answer. We are talking about colors as observed from space specifically, since the atmosphere changes color through Rayleigh scattering. «Surely this was written up long ago, in detail and with numbers, somewhere in an encyclopedia,» you will say. Well, now try searching the Internet for it. Did you succeed? Most likely not. The most you will find is a couple of words saying that the Moon has a brownish tint and the Sun a reddish one. But you will not find whether these tints are visible to the human eye, let alone the color values in RGB or at least color temperatures. What you will find is a bunch of photos and videos where the Moon from space is absolutely gray, mostly photos from the American Apollo program, and where the Sun from space is depicted as white or even blue.

In my personal opinion, this is nothing but a consequence of the intervention of politics in science. After all, the colors of the Moon and the Sun from space relate directly to the Americans' flights to the Moon.

I went through many scientific articles and books looking for information about the color of the Moon and the Sun from space. Fortunately, it turned out that although they give no direct answer in RGB, they contain complete information about the spectral density of solar radiation and the reflectivity of the Moon across the spectrum. This is quite enough to obtain accurate RGB values. You just need to do the calculation carefully, which is what I did. In this article I will share the results with you and, of course, describe the calculations themselves in detail. And you will see the Moon and the Sun from space in their real colors!
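
The kind of calculation described above, going from a spectrum to RGB, can be sketched roughly as follows. This is not the author's computation. It assumes the reader supplies equally spaced samples of the light source's spectral density, the surface reflectance, and the CIE colour matching functions, and it uses the standard XYZ to sRGB (D65) matrix and transfer curve. All function names are hypothetical.

```python
def integrate_xyz(spectrum, reflectance, cmf_x, cmf_y, cmf_z, step_nm):
    """Rectangle-rule integral of (light * reflectance) against the colour matching functions."""
    light = [s * r for s, r in zip(spectrum, reflectance)]
    x = sum(l * c for l, c in zip(light, cmf_x)) * step_nm
    y = sum(l * c for l, c in zip(light, cmf_y)) * step_nm
    z = sum(l * c for l, c in zip(light, cmf_z)) * step_nm
    return x, y, z


def xyz_to_linear_srgb(x, y, z):
    """Standard XYZ (D65) to linear sRGB matrix."""
    r = 3.2406 * x - 1.5372 * y - 0.4986 * z
    g = -0.9689 * x + 1.8758 * y + 0.0415 * z
    b = 0.0557 * x - 0.2040 * y + 1.0570 * z
    return r, g, b


def gamma_encode(c):
    """sRGB transfer curve applied to a linear channel clamped to [0, 1]."""
    c = min(max(c, 0.0), 1.0)
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055


def xyz_to_srgb8(x, y, z):
    """Convert XYZ to 8-bit sRGB, scaled so the brightest channel is 255."""
    linear = xyz_to_linear_srgb(x, y, z)
    peak = max(linear)
    if peak <= 0:
        return (0, 0, 0)
    # Dividing by the peak keeps the hue but discards absolute brightness.
    return tuple(round(255 * gamma_encode(c / peak)) for c in linear)


if __name__ == "__main__":
    # D65 white point as a sanity check: it should come out as neutral white.
    print(xyz_to_srgb8(0.95047, 1.0, 1.08883))
```

For the Sun, the reflectance list would be all ones. For the Moon, it would hold the lunar reflectivity at each sampled wavelength.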
Read more →
Total votes 4: ↑3 and ↓1 +2
Comments 0

How we made landmark recognition in Cloud Mail.ru, and why

Reading time 11 min
Views 2.4K


With the advent of mobile phones with high-quality cameras, we started taking more and more pictures and videos of bright and memorable moments in our lives. Many of us have photo archives that extend back over decades and comprise thousands of pictures, which makes them increasingly difficult to navigate. Just remember how long it took to find a picture of interest just a few years ago.

One of Mail.ru Cloud’s objectives is to provide the handiest means for accessing and searching your own photo and video archives. For this purpose, we at the Mail.ru Computer Vision Team have created and implemented systems for smart image processing: search by object, by scene, by face, etc. Another spectacular technology is landmark recognition. Today, I am going to tell you how we made this a reality using Deep Learning.
Read more →
Total votes 45: ↑44 and ↓1 +43
Comments 0

Automatic respiratory organ segmentation

Reading time 8 min
Views 2K

Manual lung segmentation takes about 10 minutes and it requires a certain skill to get the same high-quality result as with automatic segmentation. Automatic segmentation takes about 15 seconds.


I assumed that without a neural network the accuracy would not exceed 70%. I also assumed that morphological operations only prepare an image for more complex algorithms. But on the 40 tomographic studies I had on hand, admittedly a small sample, the algorithm segmented the lungs without errors. Moreover, after testing on the first five cases, the algorithm did not change significantly and worked correctly on the other 35 studies with the same settings.


Neural networks also have a disadvantage: training them requires hundreds of lung samples, each of which must be annotated manually.
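
As a rough illustration of how lungs can be segmented without a neural network, here is a minimal sketch for a single CT slice. It is not the author's algorithm. It uses thresholding plus a flood fill from the image border to discard the air outside the body, and it omits the erosion and dilation steps a real pipeline would likely add. The function name `segment_lungs` and the cut-off of -400 Hounsfield units are assumptions made for this example.

```python
from collections import deque

AIR_THRESHOLD_HU = -400  # hypothetical cut-off between air-like and tissue-like pixels


def segment_lungs(slice_hu, threshold=AIR_THRESHOLD_HU):
    """Return a boolean mask of air-like pixels that are not connected to the border."""
    rows, cols = len(slice_hu), len(slice_hu[0])
    # Step 1: mark every air-like pixel (lungs and the air around the patient).
    air = [[value < threshold for value in row] for row in slice_hu]

    # Step 2: collect air pixels on the border; they belong to the outside air.
    seeds = [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if (r in (0, rows - 1) or c in (0, cols - 1)) and air[r][c]
    ]
    for r, c in seeds:
        air[r][c] = False

    # Step 3: flood-fill inward from the border, clearing all connected outside air.
    queue = deque(seeds)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and air[nr][nc]:
                air[nr][nc] = False
                queue.append((nr, nc))

    # Whatever air remains is enclosed by tissue, which here stands in for the lungs.
    return air


if __name__ == "__main__":
    # Toy slice: outside air on the border, a ring of tissue, one lung pixel inside.
    toy = [
        [-1000, -1000, -1000, -1000, -1000],
        [-1000, 40, 40, 40, -1000],
        [-1000, 40, -800, 40, -1000],
        [-1000, 40, 40, 40, -1000],
        [-1000, -1000, -1000, -1000, -1000],
    ]
    for row in segment_lungs(toy):
        print("".join("#" if cell else "." for cell in row))
```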


Read more →
Total votes 11: ↑10 and ↓1 +9
Comments 1

AI-Based Photo Restoration

Reading time 7 min
Views 18K


Hi everybody! I’m a research engineer on the Mail.ru Group computer vision team. In this article, I’m going to tell the story of how we created an AI-based photo restoration project for old military photos. What is «photo restoration»? It consists of three steps:

  • we find all the image defects: fractures, scuffs, holes;
  • we inpaint the discovered defects, based on the pixel values around them;
  • we colorize the image.

Further, I’ll describe every step of photo restoration and tell you how we got our data, what nets we trained, what we accomplished, and what mistakes we made.
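
To give a feel for the second step above, filling defects from the pixel values around them, here is a minimal classical sketch. The article's authors trained neural networks for this. The code below is only a simple diffusion stand-in that repeatedly replaces each masked pixel with the average of its four neighbours. The function name `inpaint` and the iteration count are assumptions made for this example.

```python
def inpaint(image, mask, iterations=50):
    """Fill masked pixels of a grayscale image by repeated 4-neighbour averaging."""
    rows, cols = len(image), len(image[0])
    # Start with masked pixels at zero; known pixels keep their values throughout.
    out = [
        [0.0 if mask[r][c] else float(image[r][c]) for c in range(cols)]
        for r in range(rows)
    ]
    for _ in range(iterations):
        nxt = [row[:] for row in out]
        for r in range(rows):
            for c in range(cols):
                if not mask[r][c]:
                    continue
                # Gather the neighbours that lie inside the image.
                neighbours = [
                    out[nr][nc]
                    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= nr < rows and 0 <= nc < cols
                ]
                if neighbours:
                    nxt[r][c] = sum(neighbours) / len(neighbours)
        out = nxt
    return out


if __name__ == "__main__":
    photo = [[100, 100, 100], [100, 0, 100], [100, 100, 100]]
    defect = [[False, False, False], [False, True, False], [False, False, False]]
    for row in inpaint(photo, defect):
        print(row)
```

Diffusion like this only smooths small scratches and holes. It cannot reconstruct texture or structure, which is one reason learned inpainting models are attractive for heavily damaged photos.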
Read more →
Total votes 34: ↑33 and ↓1 +32
Comments 4

Dog Breed Identifier: Full Cycle Development from Keras Program to Android App. on Play Market

Reading time 25 min
Views 16K
With the recent progress in neural networks in general and image recognition in particular, it might seem that creating an NN-based application for image recognition is a simple routine operation. Well, to some extent it is true: if you can imagine an application of image recognition, then most likely someone has already done something similar. All you need to do is Google it and repeat.

However, there are still countless little details that… they are not unsolvable, no. They simply take too much of your time, especially if you are a beginner. What would help is a step-by-step project, done right in front of you, from start to finish. A project that does not contain «this part is obvious so let's skip it» statements. Well, almost :)

In this tutorial we are going to walk through a Dog Breed Identifier: we will create and teach a Neural Network, then we will port it to Java for Android and publish on Google Play.

For those of you who want to see the end result, here is the link to the NeuroDog App on Google Play.

Web site with my robotics: robotics.snowcron.com.
Web site with: NeuroDog User Guide.

Here is a screenshot of the program:

image

Read more →
Total votes 11: ↑11 and ↓0 +11
Comments 6
