(AI theory note 28)
"AI and the history of digital culture."
Generative AI media (2022-) is the fourth major effect of the gradual emergence of a giant digital universe of cultural content (the web). That universe was born in 1993, when the first graphical browser began popularizing the web. Over the 30 years since, we have seen many effects, including these four:
1. The first effect is the switch from categorical, hierarchical, and structured organization of information (exemplified by library catalogs and early web directories) to search engines in the late 1990s. There is so much content that organizing it in conventional ways is no longer practical, and search becomes the new default. (Note that search is based on a prediction of what will be most relevant to the user, as opposed to giving you a precise answer. AI today is also predictive...)
2. The second major effect that interests me is the development of data visualization in the first part of the 2000s - another attempt to deal with the new scale of data. It emerged as a new hip cultural field in 2005-2006. If search attempts to find the most relevant items in the giant data universe, visualization tries to show it as a whole.
3. The third effect is the emergence of "data science" as the master discipline of the new era at the end of the 2000s. (While most methods data science uses had already been available for decades, the rapid increase in unstructured data in the 2000s motivated the development of data science as the key new profession of the data society.) My own version of this stage is "cultural analytics" research, 2005-2020. And of course my main method for this research is data visualization, but applied to media (images, film, etc.).
4. The next, and certainly not the last, major effect of the growth of the digital content society is generative media (GenAI) in the early 2020s.
If data science focuses on finding patterns, relations, groupings, and outliers in big data, and if data visualization tries to summarize it visually, generative AI (i.e., particular machine learning pipelines) approaches "big content" in a different way. It allows the generation of new content with statistical properties related to existing content. Moreover, this new content (texts, images, animation, 3D models, etc.) can interpolate between existing data points, thus creating new content that goes beyond a summary of what already exists. Thus, we move from probabilistic search to probabilistic media generation, 1999-2022.
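The interpolation idea above can be made concrete with a minimal sketch: spherical interpolation (slerp) between two latent vectors, a common way diffusion and GAN systems blend between existing data points. This is an illustrative toy, not code from any specific model; the vector size and names are arbitrary assumptions.

```python
import numpy as np

def slerp(a, b, t):
    """Spherical linear interpolation between latent vectors a and b.
    Preferred over a straight line because generative models' latents
    are roughly Gaussian, i.e. concentrated near a sphere."""
    a_n = a / np.linalg.norm(a)
    b_n = b / np.linalg.norm(b)
    omega = np.arccos(np.clip(np.dot(a_n, b_n), -1.0, 1.0))  # angle between them
    if np.isclose(omega, 0.0):
        return (1 - t) * a + t * b  # nearly parallel: fall back to lerp
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

rng = np.random.default_rng(0)
z0, z1 = rng.normal(size=512), rng.normal(size=512)     # two "existing" latents
path = [slerp(z0, z1, t) for t in np.linspace(0, 1, 5)]  # 5 in-between latents
```

Decoding each vector on `path` through a generator would yield a smooth morph from one output to the other - new content "between" existing data points.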
"Artificial Intelligence (AI) Protocols
The GRAMMY Award recognizes creative excellence. Only human creators are eligible to be submitted for consideration for, nominated for, or win a GRAMMY Award. A work that contains no human authorship is not eligible in any Categories. A work that features elements of A.I. material (i.e., material generated by the use of artificial intelligence technology) is eligible in applicable Categories; however: (1) the human authorship component of the work submitted must be meaningful and more than de minimis; (2) such human authorship component must be relevant to the Category in which such work is entered (e.g., if the work is submitted in a songwriting Category, there must be meaningful and more than de minimis human authorship in respect of the music and/or lyrics; if the work is submitted in a performance Category, there must be meaningful and more than de minimis human authorship in respect of the performance); and (3) the author(s) of any A.I. material incorporated into the work are not eligible to be nominees or GRAMMY recipients insofar as their contribution to the portion of the work that consists of such A.I material is concerned. De minimis is defined as lacking significance or importance; so minor as to merit disregard."
GitHub - NVlabs/instant-ngp: Instant neural graphics primitives: lightning fast NeRF and more
https://github.com/NVlabs/instant-ngp
GitHub - tetragonites/wolf3d: code four AI Alg Art project number 2
https://github.com/tetragonites/wolf3d
https://github.com/users/tetragonites/projects/1
Group 3
Class Notes
How Sam Altman Stormed Washington to Set the A.I. Agenda. The chief executive of OpenAI, which makes ChatGPT, has met with at least 100 U.S. lawmakers in recent months. He has also taken his show abroad.
Google has created a Generative AI learning path with 9 FREE courses!
Topics cover:
- Intro to LLMs
- Attention Mechanism
- Image Generation/Captioning
- Intro to Responsible AI
From the fundamentals of LLMs to creating & deploying generative AI solutions!
Read more
Introduction to Generative AI:
An introductory level micro-learning course aimed at explaining:
- What Generative AI is
- How it is used
- How it differs from traditional ML
Check this out
https://lnkd.in/duKJ3sm6
Introduction to Large Language Models:
The course explores:
- Fundamentals of LLMs
- Their use cases
- Prompt engineering on LLMs
Check this out
https://lnkd.in/dm_yS4MQ
Introduction to Responsible AI:
The course explains what responsible AI is, why it's important, and how Google implements responsible AI in their products.
Check this out
https://lnkd.in/dV8zNvwm
Introduction to Image Generation:
This course introduces diffusion models, a family of ML models that recently showed promise in the image generation space.
Check this out
https://lnkd.in/dqcZBZqd
Encoder-Decoder Architecture:
This course gives you a synopsis of the encoder-decoder architecture.
It's a powerful and prevalent machine learning architecture for sequence-to-sequence tasks.
Check this out
https://lnkd.in/dhDhUgwJ
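As a concrete illustration of the encoder-decoder idea (my own sketch, not material from the course): the encoder folds an input sequence into a context vector, and the decoder unrolls that vector into an output sequence. The tiny untrained numpy RNN below only demonstrates the data flow and shapes; all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
vocab, hidden = 10, 16  # toy vocabulary and hidden sizes

# Untrained parameter matrices for a minimal RNN encoder/decoder.
W_xh = rng.normal(0, 0.1, (vocab, hidden))   # input token -> hidden
W_hh = rng.normal(0, 0.1, (hidden, hidden))  # hidden -> hidden
W_hy = rng.normal(0, 0.1, (hidden, vocab))   # hidden -> output logits

def encode(tokens):
    """Fold an input token sequence into a single context vector."""
    h = np.zeros(hidden)
    for t in tokens:
        h = np.tanh(W_xh[t] + W_hh @ h)
    return h

def decode(context, steps):
    """Unroll the context vector into an output token sequence."""
    h, out = context, []
    for _ in range(steps):
        h = np.tanh(W_hh @ h)
        out.append(int(np.argmax(h @ W_hy)))  # greedy choice of next token
    return out

context = encode([1, 3, 5, 2])     # "source sentence" as token ids
output = decode(context, steps=3)  # 3 generated token ids
```

Real seq2seq systems replace these matrices with trained LSTM/transformer layers, but the encode-then-decode structure is the same.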
Attention Mechanism:
The course teaches you how attention works & how it revolutionised:
- machine translation
- text summarisation
- question answering
Check this out
https://lnkd.in/dwsZZw6j
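For reference, the core operation the course covers - scaled dot-product attention - fits in a few lines. This is a generic sketch of the standard formulation, not code from the course; shapes and names are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stable
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each query attends over all keys
    and returns a weighted mix of the values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # similarity of queries to keys
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 query positions, dimension 8
K = rng.normal(size=(6, 8))  # 6 key positions
V = rng.normal(size=(6, 8))  # one value vector per key
out, w = attention(Q, K, V)  # out: one mixed value vector per query
```

This mechanism, stacked and multi-headed, is what made the advances in translation, summarisation, and question answering listed above possible.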
Transformer Models and BERT Model:
This course introduces you to some of the most famous and effective transformer architectures!
Check this out
https://lnkd.in/dK4p3n2s
Create Image Captioning Models:
This course teaches you how to create an image captioning model by using deep learning.
Check this out
https://lnkd.in/d8w32x5Y
Introduction to Generative AI Studio:
This course introduces Generative AI Studio, a product on Vertex AI.
It teaches you to prototype and customize generative AI models so you can use their capabilities in your applications.
Check this out
https://lnkd.in/dAXdSrEX
Audiocraft / MusicGen - AMAZING Text-To-Music AI Model By Facebook | Tutorial | Better Than MusicLM
https://www.youtube.com/watch?v=v-YpvPkhdO4
AlphaDev, an AI system using reinforcement learning to discover enhanced computer science algorithms.
Celebrities that promote Bored Ape NFTs are now facing a class action lawsuit - The list of defendants might as well be a Coachella lineup with names like Diplo, Snoop Dogg, Post Malone, and The Weeknd cited alongside Yuga Labs Inc as well as Justin Bieber, Paris Hilton, Madonna, Jimmy Fallon, and Kevin Hart
Grimes clones her voice - anyone can make a song with it - https://elf.tech/connect
Martian Love - Where You Want - AI Movie
https://www.youtube.com/watch?v=hQG9ZAdjsPI
Lorn Music Video
https://www.youtube.com/watch?v=3ApMSKql23I
AI Based Copycats
https://www.youtube.com/watch?v=McM3CfDjGs0
System Aesthetics - We are now in transition from an object-oriented to a systems-oriented culture. Here change emanates, not from things, but from the way things are done.
"AIBOâAn Emotionally Intelligent Artificial Intelligence Brainwave OperaâOr I Built A "Sicko" AI, and So Can You" in Information Disorder - Algorithms and Society published by Routledge Books - https://www.routledge.com/Information-Disorder-Algorithms-and-Society/Filimowicz/p/book/9781032290775
MMS: Massively Multilingual Speech.
- Can do speech-to-text and text-to-speech in 1100 languages.
- Can recognize 4000 spoken languages.
- Code and models available under the CC-BY-NC 4.0 license.
- half the word error rate of Whisper.
https://ai.facebook.com/blog/multilingual-model-speech-recognition/?utm_source=twitter&utm_medium=organic_social&utm_campaign=blog&utm_content=card
Group 2
Using .skn skin file as model texture · Issue #128 · deepmind/mujoco
(PDF) Touching Virtual Humans: Haptic Responses Reveal the Emotional Impact of Affective Agents
Interpersonal touch is critical for social-emotional development and presents a powerful modality for communicating emotions.
Progress with implementing learning with the 3D model in a physical environment has been slow but steady. Initially we were experiencing issues with the 3D model imported from Maya into Unity, but were able to get it working with Unity's physics engine as you can see here:
With this 3D model we were able to get the character moving from place to place using manually assigned ranges of motion for each joint:
But we were having some issues with the 3D model not holding its form together in the way we hoped it would.
In order to remedy these issues of the 3D model not working as expected, we have opted to rebuild it in Unity, where it should work in a more predictable way:
On the visualization side, we are developing hand-drawn animations that we hope will play over the surface of the character model, stills of which you can find here:
This approach may need to change now that the model is built within Unity, as it may be harder to extract back out of it for the purposes of rendering.
We are also beginning to consider the environment that the wolf will have to traverse. We have acquired the heightmap of a 1 km square around Węgrów, which we have converted into geometry which can act as the base surface for the wolf to walk on:
In the physical world, the wolf's rig has become a part of the bell that will be in the installation and later public monument in Węgrów - a sort of zoetrope running along the top of a bell made by the Kruszewscy Bell Company in Węgrów, the same company that has made and maintains all of the church bells in Węgrów. Here is an image of the bell just after coming out of casting in brass, before polishing and weatherproofing:
Kandinsky 2.1 inherits best practices from DALL-E 2 and latent diffusion, while introducing some new ideas.
As text and image encoders it uses the CLIP model, with a diffusion image prior (mapping) between the latent spaces of the CLIP modalities. This approach increases the visual performance of the model and unveils new horizons in blending images and text-guided image manipulation.
For the diffusion mapping between latent spaces we use a transformer with num_layers=20, num_heads=32, and hidden_size=2048.
Other architecture parts:
Text encoder (XLM-Roberta-Large-Vit-L-14) - 560M
Diffusion Image Prior - 1B
CLIP image encoder (ViT-L/14) - 427M
Latent Diffusion U-Net - 1.22B
MoVQ encoder/decoder - 67M
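As a sanity check on the listed sizes (my own back-of-envelope estimate, assuming a standard transformer layer with four h×h attention projections and a 4× feed-forward expansion, ignoring biases, layer norms, and embeddings), the stated prior configuration does come out to roughly 1B parameters:

```python
def transformer_params(num_layers, hidden_size):
    """Rough parameter count for a standard transformer. Per layer:
    attention Q/K/V/O projections: 4 * h^2
    feed-forward (h -> 4h -> h):   8 * h^2
    Note: num_heads splits the same matrices, so it does not
    change the total count."""
    per_layer = 4 * hidden_size**2 + 8 * hidden_size**2
    return num_layers * per_layer

n = transformer_params(num_layers=20, hidden_size=2048)
print(f"{n / 1e9:.2f}B")  # -> 1.01B, consistent with the 1B figure above
```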
Kandinsky 2.1 was trained on a large-scale image-text dataset LAION HighRes and fine-tuned on our internal datasets.