ulaulaman · 3 months
Text
Tumblr media
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.
I would like to underline that "strategically deceptive behavior"…
Image credit: from Avengers #55 by Roy Thomas and John Buscema
0 notes
ulaulaman · 3 months
Text
Tumblr media Tumblr media Tumblr media
The eyes of the victims are all the same.
(from Marvels #2 by Kurt Busiek and Alex Ross)
1 note · View note
ulaulaman · 4 months
Text
Tumblr media
How KISS applied makeup before concerts
I dip my fingers into the tub of white goo and start applying it all over my face, leaving some space open around my right eye, where the rough outline of the star will be. Once the white is on, I take the pointed end of a beautician’s comb, one with a metal point, and sketch the outline of the star, freehand, around my right eye. It leaves a line through the white makeup. Then with a Q-tip I clean up the inside of the star. I also clean up the shape of my lips. - Paul Stanley
Photo credit: Waring Abbott/Getty Images
2 notes · View notes
ulaulaman · 4 months
Photo
Tumblr media
LuisMelon on deviantART
13 notes · View notes
ulaulaman · 6 months
Text
Tumblr media
Christoffer Wilhelm Eckersberg (1783–1853) was a Danish painter born in the southern part of Jutland. He went on to lay the foundation for the Golden Age of Danish Painting and is referred to as the Father of Danish painting. While an apprentice, he produced proficient drawings and paintings, but soon, having amassed some financial support from local well-wishers, he arrived at Copenhagen's Royal Danish Academy in May 1803. He was accepted into the Academy without payment. Because of conflict with his teachers, he did not win the Academy’s gold medal until 1809, after his principal tutor died. In 1810, Eckersberg married Christine Rebecca Hyssing – against his wishes – in order to legitimize his son, Erling Carl Vilhelm Eckersberg. Erling eventually followed in Eckersberg’s footsteps with an Academy education, and a career as a copperplate engraver. Eager to travel, partly in order to escape the reality of this marriage, and just a few days after the wedding, Eckersberg made his way through Germany to Paris. Here he studied under neoclassicist Jacques-Louis David from 1811 to 1812. He improved his skills in painting the human form and followed his teacher's admonition to paint after Nature and the Antique to find Truth. After two years he traveled further to Florence and Rome, where he continued his studies from 1813 to 1816. Eckersberg painted one of his best portraits, a portrait of his mentor Bertel Thorvaldsen, in Rome in 1814, which was donated to the Academy of Art. Life in Rome agreed with him, and he was greatly affected by the bright southern light he experienced there. He produced a large body of work during those years, among them several exceptional landscape studies such as this one. "A View through Three of the North-Western Arches of the Third Storey of the Coliseum" was painted in 1815 or 1816 when Eckersberg sojourned in Rome, painting a series of works of the ancient ruins of the city.
The details of the ruins are precisely observed as they appear at the site in Rome. The views of the city, however, are a construction as Eckersberg connected three separate views to create a new harmony. The Royal Engraving Collection has two sketches Eckersberg did to plan his work. It is a prime example of Danish Golden Age painting. Eckersberg returned to Copenhagen and was named professor at the Academy, a position that had specifically been held open a decade awaiting his return. His greatest contribution to painting was through his revitalized teaching method of taking students out into the field, where they were challenged to do studies from nature. He introduced direct study from nature into Danish art, and encouraged his students to develop their individual strengths, thus creating unique styles. He developed an increasing interest in perspective because of his marine paintings. He wrote a dissertation on the subject called "Linear perspective used in the art of painting" and taught classes on the subject at the Academy. He made a small number of etchings that combine daily life observations with classical, harmonious principles of composition. This led the way to the characteristic way Danish Golden Age painters portrayed common, everyday life.
- Clinton Pittman
1 note · View note
ulaulaman · 7 months
Text
Tumblr media
Broken relativity by Sam Chivers
2 notes · View notes
ulaulaman · 7 months
Text
Tumblr media
Unusual places to find Euler's equation
Euler's equation, described as one of the most beautiful equations in mathematics, brings together two mathematical constants, two different kinds of numbers, and a particular family of transformations of space. The constants are Euler's number, e, and pi, the mathematical constant par excellence. The kinds of numbers are the real numbers, represented by e and pi (which are transcendental) and in part by 1 (an integer), and the imaginary numbers, represented by i, the square root of -1. The symmetry transformations involved are the rotations.
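As a small aside, the identity e^(i·pi) + 1 = 0 and its reading as a rotation can be checked numerically. The sketch below is not part of the original post: the function name `rotate` and the variable names are illustrative assumptions, and it relies only on Python's standard `cmath` and `math` modules.

```python
import cmath
import math


def rotate(z: complex, theta: float) -> complex:
    """Rotate the complex number z by theta radians about the origin."""
    # Multiplying by e^(i*theta) is a rotation of the complex plane.
    return z * cmath.exp(1j * theta)


# Euler's identity: e^(i*pi) + 1 should vanish up to floating-point rounding.
identity_residual = cmath.exp(1j * math.pi) + 1

# Read geometrically, the identity says that half a turn sends 1 to -1.
half_turn = rotate(1 + 0j, math.pi)

# A quarter turn sends 1 to the imaginary unit i.
quarter_turn = rotate(1 + 0j, math.pi / 2)
```

The results are compared against a small tolerance, not tested for exact equality, because pi cannot be represented exactly as a floating-point number.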
Continues on DropSea...
1 note · View note
ulaulaman · 7 months
Photo
Tumblr media
The child learns by believing the adult. Doubt comes after belief.
Ludwig Wittgenstein
1 note · View note
ulaulaman · 1 year
Photo
Tumblr media
Jessie Tarbox Beals (December 23, 1870 – May 30, 1942) was an American photographer, the first published female photojournalist in the United States and the first female night photographer. She is best known for her freelance news photographs, particularly of the 1904 St. Louis World's Fair, and portraits of places such as Bohemian Greenwich Village. Her trademarks were her self-described "ability to hustle" and her tenacity in overcoming gender barriers in her profession.
13 notes · View notes
ulaulaman · 2 years
Video
youtube
The first extragalactic black hole
read more
3 notes · View notes
ulaulaman · 2 years
Photo
Tumblr media
Craters of the far side of the Moon - via commons
NASA Lunar Orbiter probes
13 notes · View notes
ulaulaman · 2 years
Photo
Tumblr media
A stroll along the #navigli #milano — view on Instagram https://ift.tt/HKZFDU9
2 notes · View notes
ulaulaman · 2 years
Photo
Tumblr media
#hyperboliccosine #math #mathematics #streetart #milano — view on Instagram https://ift.tt/H0o6fOa
2 notes · View notes
ulaulaman · 2 years
Photo
Tumblr media
#GaetanoLiguori #pianocity #milano #jazz — view on Instagram https://ift.tt/HIrKAU5
0 notes
ulaulaman · 2 years
Photo
Tumblr media
Gravity's Grin
Albert Einstein's general theory of relativity, published over 100 years ago, predicted the phenomenon of gravitational lensing. And that's what gives these distant galaxies such a whimsical appearance, seen through the looking glass of X-ray and optical image data from the Chandra and Hubble space telescopes. Nicknamed the Cheshire Cat galaxy group, the group's two large elliptical galaxies are suggestively framed by arcs. The arcs are optical images of distant background galaxies lensed by the foreground group's total distribution of gravitational mass. Of course, that gravitational mass is dominated by dark matter. The two large elliptical "eye" galaxies represent the brightest members of their own galaxy groups which are merging. Their relative collisional speed of nearly 1,350 kilometers/second heats gas to millions of degrees producing the X-ray glow shown in purple hues. Curiouser about galaxy group mergers? The Cheshire Cat group grins in the constellation Ursa Major, some 4.6 billion light-years away.
Image Credit: X-ray - NASA / CXC / J. Irwin et al. ; Optical - NASA/STScI
1 note · View note
ulaulaman · 2 years
Photo
Tumblr media
#ladybug #nature #natureincity #OrtoBotanico #OrtoBotanicoBrera #Milano #cosebelle — view on Instagram https://ift.tt/rxHGbB5
0 notes
ulaulaman · 2 years
Photo
Tumblr media
#beauty #health and #trash — view on Instagram https://ift.tt/OgBnNcY
0 notes