<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Fergus Kidd]]></title><description><![CDATA[Fergus Kidd]]></description><link>https://fergusblog.azurewebsites.net/</link><image><url>https://fergusblog.azurewebsites.net/favicon.png</url><title>Fergus Kidd</title><link>https://fergusblog.azurewebsites.net/</link></image><generator>Ghost 4.47</generator><lastBuildDate>Sat, 11 Apr 2026 19:28:25 GMT</lastBuildDate><atom:link href="https://fergusblog.azurewebsites.net/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[AI generated music video with Suno and Sora]]></title><description><![CDATA[<figure class="kg-card kg-video-card kg-card-hascaption"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2025/04/My-Movie-1.mp4" poster="https://img.spacergif.org/v1/960x540/0a/spacer.png" width="960" height="540" playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2025/04/media-thumbnail-ember191.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container"><div class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div><figcaption>An AI generated music video based on random words sourced from inputs on a social media 
post</figcaption></figure>]]></description><link>https://fergusblog.azurewebsites.net/ai-generated-music-videos-with-suno-and-sora/</link><guid isPermaLink="false">67ffef9f08d10000010e835a</guid><category><![CDATA[AI]]></category><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Wed, 16 Apr 2025 18:01:42 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2025/04/Screenshot-2025-04-16-at-18.58.46.png" medium="image"/><content:encoded><![CDATA[<figure class="kg-card kg-video-card kg-card-hascaption"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2025/04/My-Movie-1.mp4" poster="https://img.spacergif.org/v1/960x540/0a/spacer.png" width="960" height="540" playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2025/04/media-thumbnail-ember191.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container"><div class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div><figcaption>An AI generated music video based on random words sourced from inputs on a social media post</figcaption></figure>]]></content:encoded></item><item><title><![CDATA[The Human Agent Connection]]></title><description><![CDATA[<figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/LGOBPPi5_gw?list=PLKN6Sz7yuHvWP1_rzakBox88rdhJGJTfd" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" 
referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe></figure><p>Episode 5 - The Human Agent Connection<br></p><p>I got to join George Sims and James Woodall on their &apos;GoHAndsFree&apos; podcast to chat about RealWear and the future of Human and AI collaborative systems.<br></p>]]></description><link>https://fergusblog.azurewebsites.net/the-human-agent-connection/</link><guid isPermaLink="false">674071193c358200014c4f2e</guid><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Fri, 22 Nov 2024 11:57:52 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2024/11/Screenshot-2024-11-22-at-11.57.23.png" medium="image"/><content:encoded><![CDATA[<figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/LGOBPPi5_gw?list=PLKN6Sz7yuHvWP1_rzakBox88rdhJGJTfd" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe></figure><img src="https://fergusblog.azurewebsites.net/content/images/2024/11/Screenshot-2024-11-22-at-11.57.23.png" alt="The Human Agent Connection"><p>Episode 5 - The Human Agent Connection<br></p><p>I got to join George Sims and James Woodall on their &apos;GoHAndsFree&apos; podcast to chat about RealWear and the future of Human and AI collaborative systems.<br></p>]]></content:encoded></item><item><title><![CDATA[LIVE! AI Self Interview]]></title><description><![CDATA[<figure class="kg-card kg-video-card"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2024/10/Live-Self-Interview.mp4" poster="https://img.spacergif.org/v1/1920x1080/0a/spacer.png" width="1920" height="1080" playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2024/10/media-thumbnail-ember149.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container"><div class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 
4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div></figure><p>I interviewed my AI self live.<br>No edits, no cuts, no pre-processed answers, just a short conversation with myself...<br><br>This uses a brand spanking new live interactive avatar from <a href="https://app.heygen.com/labs">HeyGen Labs</a>, and OpenAI with access to a small knowledge base about me. Now I can finally</p>]]></description><link>https://fergusblog.azurewebsites.net/live-ai-self-interview/</link><guid isPermaLink="false">6716273c28de6a00015ee27c</guid><category><![CDATA[AI]]></category><category><![CDATA[News]]></category><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Mon, 21 Oct 2024 11:06:30 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2024/10/Screenshot-2024-10-21-at-11.05.42.png" medium="image"/><content:encoded><![CDATA[<figure class="kg-card kg-video-card"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2024/10/Live-Self-Interview.mp4" poster="https://img.spacergif.org/v1/1920x1080/0a/spacer.png" width="1920" height="1080" playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2024/10/media-thumbnail-ember149.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container"><div class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 
1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div></figure><img src="https://fergusblog.azurewebsites.net/content/images/2024/10/Screenshot-2024-10-21-at-11.05.42.png" alt="LIVE! AI Self Interview"><p>I interviewed my AI self live.<br>No edits, no cuts, no pre-processed answers, just a short conversation with myself...<br><br>This uses a brand spanking new live interactive avatar from <a href="https://app.heygen.com/labs">HeyGen Labs</a>, and OpenAI with access to a small knowledge base about me. Now I can finally be in two places at once! HeyGen has already included Zoom integration, so it&apos;s easy for digital me to turn up to digital meetings. Fun for a prank, or revolutionary for 1-1 customer Question Answering, you decide.<br><br>DISCLAIMER - real me has never pronounced data like that and I never will &#x1F624;</p>]]></content:encoded></item><item><title><![CDATA[AI Videos with Runway]]></title><description><![CDATA[<p>I&apos;ve been playing a little with AI generated videos.<br><br>I used a very basic setup, taking a photo of me, and extending it to the correct frame dimensions with photoshop&apos;s generative fill.</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2024/09/Untitled-start.png" class="kg-image" alt loading="lazy" width="1890" height="1417" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2024/09/Untitled-start.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2024/09/Untitled-start.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2024/09/Untitled-start.png 1600w, https://fergusblog.azurewebsites.net/content/images/2024/09/Untitled-start.png 1890w" sizes="(min-width: 720px) 720px"></figure><p>became:</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2024/09/Untitled-2.png" class="kg-image" alt loading="lazy" width="1890" height="1417" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2024/09/Untitled-2.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2024/09/Untitled-2.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2024/09/Untitled-2.png 1600w, https://fergusblog.azurewebsites.net/content/images/2024/09/Untitled-2.png 1890w" sizes="(min-width: 720px) 720px"></figure><p>Then I simply ran it through <a href="https://runwayml.com/">runway</a>, and looked at the results.</p><figure class="kg-card kg-video-card"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2024/09/715fab5e-884b-4b54-8412-2c2caa9e4c93.mp4" poster="https://img.spacergif.org/v1/1280x768/0a/spacer.png" width="1280" height="768" loop autoplay muted playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2024/09/media-thumbnail-ember115.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container kg-video-hide"><div 
class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:</span></div></div></div></figure>]]></description><link>https://fergusblog.azurewebsites.net/ai-videos/</link><guid isPermaLink="false">66e06f9355aa0d000107fa7a</guid><category><![CDATA[AI]]></category><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Tue, 10 Sep 2024 16:22:13 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2024/09/Untitled-2-1.png" medium="image"/><content:encoded><![CDATA[<img src="https://fergusblog.azurewebsites.net/content/images/2024/09/Untitled-2-1.png" alt="AI Videos with Runway"><p>I&apos;ve been playing a little with AI generated videos.<br><br>I used a very basic setup, taking a photo of me, and extending it to the correct frame dimensions with photoshop&apos;s generative fill.</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2024/09/Untitled-start.png" class="kg-image" alt="AI Videos with Runway" loading="lazy" width="1890" height="1417" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2024/09/Untitled-start.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2024/09/Untitled-start.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2024/09/Untitled-start.png 1600w, https://fergusblog.azurewebsites.net/content/images/2024/09/Untitled-start.png 1890w" sizes="(min-width: 720px) 720px"></figure><p>became:</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2024/09/Untitled-2.png" class="kg-image" alt="AI Videos with Runway" loading="lazy" width="1890" height="1417" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2024/09/Untitled-2.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2024/09/Untitled-2.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2024/09/Untitled-2.png 1600w, https://fergusblog.azurewebsites.net/content/images/2024/09/Untitled-2.png 1890w" sizes="(min-width: 720px) 720px"></figure><p>Then I simply ran it through <a href="https://runwayml.com/">runway</a>, and looked at the results.</p><figure class="kg-card kg-video-card"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2024/09/715fab5e-884b-4b54-8412-2c2caa9e4c93.mp4" poster="https://img.spacergif.org/v1/1280x768/0a/spacer.png" width="1280" height="768" loop autoplay muted playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2024/09/media-thumbnail-ember115.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container kg-video-hide"><div 
class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div></figure><p>This was the output with the least deformation of my face. 
(note I had a slightly different room option to start this one from).</p><p>But there were some more exciting examples:</p><figure class="kg-card kg-video-card"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2024/09/6f778b11-e2c3-4f1f-a3a2-0eef695ce585.mp4" poster="https://img.spacergif.org/v1/1280x768/0a/spacer.png" width="1280" height="768" loop autoplay muted playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2024/09/media-thumbnail-ember138.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container kg-video-hide"><div class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div></figure><p>and:</p><figure class="kg-card kg-video-card"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2024/09/d345b91c-d274-4b0b-8726-58d3794d6b7f.mp4" poster="https://img.spacergif.org/v1/1280x768/0a/spacer.png" width="1280" height="768" loop autoplay muted playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2024/09/media-thumbnail-ember150.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container kg-video-hide"><div 
class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div></figure><p><br>These include some fun, dynamic, if not a bit mad, movements from Rory the robot.</p><p><br>There is quite a lot of unwanted deformation here, and certainly, improvements to be made, but human faces and features are tough to get right, as our brains are incredibly good at detecting them and noticing things that go wrong. I am not a fan of the ageing process it seems to put me through when I move my head, though...<br></p><p>Some incredible things to note though, are the really accurate shadows cast on the wall from the robot as it moves, these sorts of details will really add to the believability as the tools improve, the pace of which by the way, is rapid.</p>]]></content:encoded></item><item><title><![CDATA[AI Learns to Game. Sort of.]]></title><description><![CDATA[<p>I&apos;ve recently become interested in AI, specifically generative AI, doing things it really wasn&apos;t built to do. I also love a good visualisation, so I combined the two things to teach GPT-V to game and then used this to explore how we can better outcomes from</p>]]></description><link>https://fergusblog.azurewebsites.net/ai-learns-to-game-sort-of/</link><guid isPermaLink="false">669ac2cde5007a0001ba104d</guid><category><![CDATA[AI]]></category><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Fri, 19 Jul 2024 20:09:30 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2024/07/frame_140.png" medium="image"/><content:encoded><![CDATA[<img src="https://fergusblog.azurewebsites.net/content/images/2024/07/frame_140.png" alt="AI Learns to Game. Sort of."><p>I&apos;ve recently become interested in AI, specifically generative AI, doing things it really wasn&apos;t built to do. 
I also love a good visualisation, so I combined the two things to teach GPT-V to game and then used this to explore how we can get better outcomes from generic models, using a bit of sensible structure and guidance.<br>Using an open-source game example from Pygame, I asked GPT-V to defend this open-source planet from the alien invasion.</p><p>I started by showing GPT-V a single frame of the game and asking it to decide whether to fire the gun, wait, move left, or move right.</p><p>The result is less than stellar. Although it can do basic movements, it fails to come up with any strategy and just sits in the corner (this will be a common thread...)</p><p>In this example, the AI is only shown one frame. It has no capacity to remember previous states or previous instructions. Looking at the log just shows a lot of &apos;FIRE!&apos; as it has no reference that it has exceeded the total number of shots. It gets stuck in the corner, and whilst there is a glimmer of hope as it attempts to run from the kill-shot bomb, it is ruined by the decision to instantly retreat to the corner, to its death.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2024/07/ezgif-5-4fff8c2361.gif" class="kg-image" alt="AI Learns to Game. Sort of." loading="lazy" width="640" height="480"><figcaption>AI defends a planet, with a single frame of reference. Score 5</figcaption></figure><p>The next thing I tried was expanding its memory to 10 frames so that it could see its last positions and moves. This is always a good idea for added context, but it doesn&apos;t help in our case, as the &apos;hide in the corner&apos; strategy, reached by accident, is fundamentally flawed.</p><p>The next step was to introduce a new &apos;agent&apos; to the scenario. This &apos;agent&apos; is simply a piece of code that feeds GPT-V some relevant text-based information, like how many shots have been fired, how many bombs are active, and where on the screen the player is. (My hope was that it would recognise that it was getting stuck in the corner.)</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2024/07/ezgif-5-8b2afb1b14.gif" class="kg-image" alt="AI Learns to Game. Sort of." loading="lazy" width="640" height="480"><figcaption>Now the AI has slightly more information about the scenario - Score 7</figcaption></figure><p>This slightly improves our outlook. We get stuck in the corner less, and the gameplay is a bit more dynamic. There is some movement to the shots, but ultimately the vision AI fails to see the obvious threat above the player, and simply waits for its impending doom.</p><p>This is strange behaviour though, because if you ask GPT-V where the bombs are, it can accurately tell you that a bomb is in a threatening place, but the decision isn&apos;t made to move out of the way based on visual information alone.<br>This is where a truly multi-agent approach comes in. This time, I set it up so that not only do we have an &apos;intelligent&apos; agent playing the game and a basic agent giving some live stats, but now I also introduce a separate &apos;intelligent&apos; strategy agent.
This agent is the same GPT-V technology, but rather than being asked to give an order to move or fire, it is asked to give a strategy and explain to the player agent where the bombs are and where the danger comes from.</p><p>This was interesting, as the strategy agent would add information it gathered from the frame to add additional context.</p><div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-text">&quot;Looking at the layout, the truck should move to the left to evade the falling bomb, as moving to the right would bring it closer to the corner, potentially trapping it. Therefore, the truck should move left immediately to stay safe.&quot; - Strategy Agent 2024</div></div><p>This additional context genuinely changes the way the player operates, keeping it more central and moving it dynamically and consistently away from threats.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2024/07/ezgif-1-adf5db37ee.gif" class="kg-image" alt="AI Learns to Game. Sort of." loading="lazy" width="640" height="480"><figcaption>High score of 22, with a multi-agent approach</figcaption></figure><p>This time the gameplay is revolutionised. The player moves much more dynamically and knows when to move away from bombs. The corner strategy is still strong, but it is made less prevalent by the strategy agent. The player also seems to be much better at killing the aliens in this scenario than in the others. Around a score of 17 or 18, we see the player actively moving out of the corner to avoid bombs, because it actually knows:</p><ol><li>It will get stuck in the corner</li><li>The bomb is above the player.</li></ol><p><br>All thanks to the strategy agent.</p><p>The player eventually gets trapped in a tight spot and loses the game, but I do think this fun demonstration shows that multi-agent approaches, using both smart and dumber agents, can increase effectiveness just by using the same pieces of technology.</p>]]></content:encoded></item><item><title><![CDATA[AI For productivity]]></title><description><![CDATA[<figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/9P98PBFVvBc?list=PLoP-4KVd7rByKILe__roYiF4GrDIJfu-I" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe></figure>]]></description><link>https://fergusblog.azurewebsites.net/ai-for-priductivity/</link><guid isPermaLink="false">666713d93c916b0001ce0463</guid><category><![CDATA[AI]]></category><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Mon, 10 Jun 2024 14:56:00 GMT</pubDate><content:encoded><![CDATA[<figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/9P98PBFVvBc?list=PLoP-4KVd7rByKILe__roYiF4GrDIJfu-I" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe></figure>]]></content:encoded></item><item><title><![CDATA[A chat with Rory]]></title><description><![CDATA[<p>Next up in my interviewing AI series... inspired by the OpenAI robotic announcements a week or two ago, I wanted to see how far I could get to replicating their video.
We&apos;ve not managed a robotic transformer for our <a href="https://www.linkedin.com/company/hello-robot-inc/">Hello Robot Inc</a> Stretch RE1 (we named Rory) yet,</p>]]></description><link>https://fergusblog.azurewebsites.net/a-chat-with-rory/</link><guid isPermaLink="false">660c2e6c63a75d00015fb598</guid><category><![CDATA[AI]]></category><category><![CDATA[Hardware]]></category><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Tue, 02 Apr 2024 16:20:10 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2024/04/Screenshot-2024-04-02-at-17.22.08.png" medium="image"/><content:encoded><![CDATA[<img src="https://fergusblog.azurewebsites.net/content/images/2024/04/Screenshot-2024-04-02-at-17.22.08.png" alt="A chat with Rory"><p>Next up in my interviewing AI series... inspired by the OpenAI robotic announcements a week or two ago, I wanted to see how far I could get to replicating their video. We&apos;ve not managed a robotic transformer for our <a href="https://www.linkedin.com/company/hello-robot-inc/">Hello Robot Inc</a> Stretch RE1 (we named Rory) yet, but watch this space.<br><br>This is a quick interview powered by GPT4 on Microsoft Azure including GPT-V powered vision processing. The voice is powered by OpenAI&apos;s text-to-speech models. I built the robot a self-aware persona with a touch of humour sprinkled in.</p><figure class="kg-card kg-video-card kg-width-wide"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2024/04/RoryAndFergus.mp4" poster="https://img.spacergif.org/v1/1920x1080/0a/spacer.png" width="1920" height="1080" playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2024/04/media-thumbnail-ember121.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container"><div class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 
0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div></figure>]]></content:encoded></item><item><title><![CDATA[I Interview Myself. Sort of.]]></title><description><![CDATA[<figure class="kg-card kg-video-card kg-card-hascaption"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2024/01/Fergus-Interview.mp4" poster="https://img.spacergif.org/v1/1920x1080/0a/spacer.png" width="1920" height="1080" playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2024/01/media-thumbnail-ember100.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container"><div class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div><figcaption>Fergus interviews AI Fergus</figcaption></figure><p><strong>An AI description of the video using GPT-4 Vision and Azure Vision AI:</strong><br>In the ever-evolving landscape of artificial intelligence, the fusion of human interaction with AI personas has reached new heights. 
A remarkable example of this synergy is encapsulated in a</p>]]></description><link>https://fergusblog.azurewebsites.net/i-interview-myself-sort-of/</link><guid isPermaLink="false">65b3b58d1304490001044394</guid><category><![CDATA[AI]]></category><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Fri, 26 Jan 2024 14:24:05 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2024/01/Screenshot-2024-01-26-at-14.22.45.png" medium="image"/><content:encoded><![CDATA[<figure class="kg-card kg-video-card kg-card-hascaption"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2024/01/Fergus-Interview.mp4" poster="https://img.spacergif.org/v1/1920x1080/0a/spacer.png" width="1920" height="1080" playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2024/01/media-thumbnail-ember100.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container"><div class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div><figcaption>Fergus interviews AI Fergus</figcaption></figure><img src="https://fergusblog.azurewebsites.net/content/images/2024/01/Screenshot-2024-01-26-at-14.22.45.png" alt="I Interview Myself. Sort of."><p><strong>An AI description of the video using GPT-4 Vision and Azure Vision AI:</strong><br>In the ever-evolving landscape of artificial intelligence, the fusion of human interaction with AI personas has reached new heights. 
A remarkable example of this synergy is encapsulated in a video that showcases an intriguing interview between Fergus Kid, an R&amp;D engineering lead at Avanade, and his AI counterpart. This innovative interaction not only highlights the capabilities of AI but also demonstrates the creative possibilities when leveraging Azure OpenAI&apos;s GPT-4 and the HeyGen avatar creation platform.<br><br>The video, set against the backdrop of a cozy and well-lit living room, starts with Fergus leaning forward, his curiosity evident as he prepares to dive into a conversation with a digital version of himself. The scene seamlessly transitions to a more formal setting where Fergus&apos;s AI persona is comfortably seated on an orange couch, embodying a serene and professional ambiance within what appears to be a modern office space.<br><br>The interview delves into the personal and professional aspects of Fergus&apos;s life, from his birth in Winchester to his current life in London, his hobbies like gaming and beekeeping, and his profound interest in space and AI. These details not only serve to humanise the AI but also ground the conversation in tangible reality.<br><br>The technology underpinning this fascinating exchange is Azure OpenAI&apos;s GPT-4, a state-of-the-art machine learning platform that utilizes advanced natural language processing to craft responses that are remarkably human-like. GPT-4&apos;s capabilities allow the AI version of Fergus to engage in a fluid dialogue, discussing complex topics such as the ethical implications of AI and the potential benefits of AI personas for society.<br><br>Complementing GPT-4&apos;s linguistic prowess is the HeyGen avatar creation tool, which has been instrumental in bringing the AI persona to life visually. The attention to detail in the avatar&apos;s creation is evident as it mirrors Fergus&apos;s movements and mannerisms, creating an immersive and convincing experience. The avatar&apos;s realistic presence on the orange couch, interacting with its environment in a natural way, underscores the potential of AI in humanizing technology.<br><br>Throughout the video, the interplay of light and shadow, the strategic placement of indoor plants, and the modern decor contribute to a setting that is both inviting and forward-thinking, echoing the innovative spirit of the conversation. The shifts between the living room and office scenes provide a visual rhythm to the dialogue, maintaining viewer engagement.<br><br>This video serves as a testament to the strides made in AI and avatar technology, specifically through the integration of Azure OpenAI&apos;s GPT-4 and HeyGen&apos;s avatar platform. It offers a glimpse into the future of human-AI interaction, where digital personas can not only replicate human conversation but also provide unique insights and augment human capabilities. As the AI Fergus aptly puts it, the key lies in approaching AI development responsibly, ensuring it paves the way for positive outcomes for humanity.<br><br>In creating such a video, the creators have not only pushed the boundaries of what&apos;s possible with AI but have also presented a compelling narrative that encourages viewers to ponder the role of AI in our lives. 
It&apos;s a blend of technical marvel and creative storytelling that beckons us to consider the harmonious potential of humans and AI coexisting and collaborating.<br><br>As we step into a future where AI becomes increasingly integrated into our daily lives, videos like these are pivotal in shaping our understanding and expectations of this technology. It&apos;s a conversation starter, an educational tool, and a peek into what might soon become a new normal in the interaction between humans and their AI counterparts.</p>]]></content:encoded></item><item><title><![CDATA[Are you even real?]]></title><description><![CDATA[<figure class="kg-card kg-video-card"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2023/10/Untitled-Video--2-.mp4" poster="https://img.spacergif.org/v1/1920x1080/0a/spacer.png" width="1920" height="1080" playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2023/10/media-thumbnail-ember115.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container"><div class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div></figure><p>Make your own: <a href="https://app.heygen.com/">https://app.heygen.com/</a></p>]]></description><link>https://fergusblog.azurewebsites.net/are-you-even-real/</link><guid isPermaLink="false">6540d51d3ffd0500016acfe1</guid><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Tue, 31 Oct 2023 10:22:28 GMT</pubDate><media:content 
url="https://fergusblog.azurewebsites.net/content/images/2023/10/Screenshot-2023-10-31-at-10.21.06.png" medium="image"/><content:encoded><![CDATA[<figure class="kg-card kg-video-card"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2023/10/Untitled-Video--2-.mp4" poster="https://img.spacergif.org/v1/1920x1080/0a/spacer.png" width="1920" height="1080" playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2023/10/media-thumbnail-ember115.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container"><div class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div></figure><img src="https://fergusblog.azurewebsites.net/content/images/2023/10/Screenshot-2023-10-31-at-10.21.06.png" alt="Are you even real?"><p>Make your own: <a href="https://app.heygen.com/">https://app.heygen.com/</a></p>]]></content:encoded></item><item><title><![CDATA[Voice Font - Now With More Neural]]></title><description><![CDATA[<p>One exciting thing I worked on very early on was creating my own custom voice font with Microsoft&apos;s custom voice font to enable text to speech. 
In the years since, we have seen some amazing advancements in neural text to speech, which give a much better output.<br><br>When</p>]]></description><link>https://fergusblog.azurewebsites.net/voice-font-now-with-more-neural/</link><guid isPermaLink="false">648852936f01a5000198823d</guid><category><![CDATA[AI]]></category><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Tue, 13 Jun 2023 12:44:16 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2023/06/Screenshot-2023-06-13-at-12.34.49.png" medium="image"/><content:encoded><![CDATA[<img src="https://fergusblog.azurewebsites.net/content/images/2023/06/Screenshot-2023-06-13-at-12.34.49.png" alt="Voice Font - Now With More Neural"><p>One exciting thing I worked on very early on was creating my own custom voice font with Microsoft&apos;s custom voice service to enable text to speech. In the years since, we have seen some amazing advancements in neural text to speech, which give a much better output.<br><br>When I found out neural custom voices were available I couldn&apos;t resist trying one out. I was eager to see how accurate and natural the results would be compared to my previous attempts, which, although recognisable, were not quite there.<br><br>For those who are unfamiliar, Microsoft Neural TTS is a cloud-based service that aims to provide highly realistic and expressive speech synthesis using deep neural networks. Voice Fonts, on the other hand, allow users to create custom voices by training the TTS engine on their own voice recordings.<br><br>To begin my experiment, I signed up for Microsoft Azure Cognitive Services and followed the provided guidelines for recording my 50 utterances. The process was relatively straightforward, with clear instructions on how to maintain consistent pitch, tone, and pacing throughout the recordings. These are recorded right in the portal, which is super easy. I did use a much better quality microphone than in my attempts several years ago. The training is pretty quick, but to use the font you need to record a short statement confirming that you agree to have your voice replicated. <br>The moment of truth arrived as I typed a sample text into the Neural TTS interface and selected my custom voice.
Listen to the results yourself:</p><h3 id="an-excerpt-from-mary-shelleys-frankenstein"><br><br>An excerpt from Mary Shelley&apos;s Frankenstein</h3><div class="kg-card kg-audio-card"><img src alt="Voice Font - Now With More Neural" class="kg-audio-thumbnail kg-audio-hide"><div class="kg-audio-thumbnail placeholder"><svg width="24" height="24" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M7.5 15.33a.75.75 0 1 0 0 1.5.75.75 0 0 0 0-1.5Zm-2.25.75a2.25 2.25 0 1 1 4.5 0 2.25 2.25 0 0 1-4.5 0ZM15 13.83a.75.75 0 1 0 0 1.5.75.75 0 0 0 0-1.5Zm-2.25.75a2.25 2.25 0 1 1 4.5 0 2.25 2.25 0 0 1-4.5 0Z"/><path fill-rule="evenodd" clip-rule="evenodd" d="M14.486 6.81A2.25 2.25 0 0 1 17.25 9v5.579a.75.75 0 0 1-1.5 0v-5.58a.75.75 0 0 0-.932-.727.755.755 0 0 1-.059.013l-4.465.744a.75.75 0 0 0-.544.72v6.33a.75.75 0 0 1-1.5 0v-6.33a2.25 2.25 0 0 1 1.763-2.194l4.473-.746Z"/><path fill-rule="evenodd" clip-rule="evenodd" d="M3 1.5a.75.75 0 0 0-.75.75v19.5a.75.75 0 0 0 .75.75h18a.75.75 0 0 0 .75-.75V5.133a.75.75 0 0 0-.225-.535l-.002-.002-3-2.883A.75.75 0 0 0 18 1.5H3ZM1.409.659A2.25 2.25 0 0 1 3 0h15a2.25 2.25 0 0 1 1.568.637l.003.002 3 2.883a2.25 2.25 0 0 1 .679 1.61V21.75A2.25 2.25 0 0 1 21 24H3a2.25 2.25 0 0 1-2.25-2.25V2.25c0-.597.237-1.169.659-1.591Z"/></svg></div><div class="kg-audio-player-container"><audio src="https://fergusblog.azurewebsites.net/content/media/2023/06/audio-file.mp3" preload="metadata"></audio><div class="kg-audio-title">Audio file</div><div class="kg-audio-player"><button class="kg-audio-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-audio-pause-icon kg-audio-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-audio-current-time">0:00</span><div class="kg-audio-time">/<span class="kg-audio-duration">0:28</span></div><input type="range" class="kg-audio-seek-slider" max="100" value="0"><button class="kg-audio-playback-rate">1&#xD7;</button><button class="kg-audio-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-audio-mute-icon kg-audio-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-audio-volume-slider" max="100" value="100"></div></div></div><p><br>This quick experiment has left me eager to explore further and see how much more realistic the Voice Font can become with additional recordings on the professional version, although up to 1000 are required. 
I&apos;m also looking forward to seeing how Microsoft and other companies continue to push the boundaries of TTS technology in the coming years. It looks like there will be a version native in the next apple iOS iteration later this year. The future of personalized, expressive speech synthesis is here, and it&apos;ll be interesting to see exactly how it gets used... For good and bad...</p>]]></content:encoded></item><item><title><![CDATA[A Robotic Wonderland: ICRA 2023]]></title><description><![CDATA[<p>As a self-proclaimed robotics enthusiast, I could not have been more thrilled to attend and present a late breaking poster at the International Conference on Robotics and Automation (ICRA) 2023. The event exceeded my expectations, with fascinating keynotes, talks, and demonstrations that showcased the cutting edge of robotics technology.</p><p>One</p>]]></description><link>https://fergusblog.azurewebsites.net/a-robotic-wonderland-icra-2023/</link><guid isPermaLink="false">64884ff86f01a50001988206</guid><category><![CDATA[Hardware]]></category><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Tue, 13 Jun 2023 11:25:01 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2023/06/IMG_0546.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://fergusblog.azurewebsites.net/content/images/2023/06/IMG_0546.jpg" alt="A Robotic Wonderland: ICRA 2023"><p>As a self-proclaimed robotics enthusiast, I could not have been more thrilled to attend and present a late breaking poster at the International Conference on Robotics and Automation (ICRA) 2023. The event exceeded my expectations, with fascinating keynotes, talks, and demonstrations that showcased the cutting edge of robotics technology.</p><p>One of the most striking aspects of ICRA 2023 was the sheer number of robotic dogs running around the conference venue! It seemed like every other exhibitor had their own version of a robotic canine companion, each with unique features and abilities and custom sensor set. Some were designed for search and rescue missions, while others were more focused on indoor mapping with LIDAR. Regardless of their purpose, seeing so many robot dogs in one place was an unforgettable experience.</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2023/06/IMG_0544.jpg" class="kg-image" alt="A Robotic Wonderland: ICRA 2023" loading="lazy" width="1284" height="1247" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2023/06/IMG_0544.jpg 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2023/06/IMG_0544.jpg 1000w, https://fergusblog.azurewebsites.net/content/images/2023/06/IMG_0544.jpg 1284w" sizes="(min-width: 720px) 720px"></figure><p><br>However, the most intriguing encounters at the conference were the human-robot interactions. As I roamed the exhibition hall, I stumbled upon a stall showcasing a humanoid robot designed to mimic human expressions and emotions from Ameca. 
</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2023/06/IMG_0543-1.jpg" class="kg-image" alt="A Robotic Wonderland: ICRA 2023" loading="lazy" width="1284" height="2256" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2023/06/IMG_0543-1.jpg 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2023/06/IMG_0543-1.jpg 1000w, https://fergusblog.azurewebsites.net/content/images/2023/06/IMG_0543-1.jpg 1284w" sizes="(min-width: 720px) 720px"></figure><p><br>It was a fascinating and, admittedly, somewhat creepy experience to witness this human-like robot move so naturally with expression. It was a stark reminder of the rapid advancements being made in the field of robotics and how far we can come when we link this with technologies in like voice fonts and generative AI.<br><br>Another highlight of ICRA 2023 was the incredible array of keynote speakers and talks. Some of the most influential figures in the robotics and automation industry shared their insights on the current state of the field, as well as their predictions for the future. Topics ranged from ethical considerations in agricultural robotics to the latest developments in machine learning and computer vision.<br><br>A surprise at ICRA 2023 was the sheer number of people showcasing simulation technologies. There was a notable emphasis on using virtual environments to train and test robotic systems, with several exhibitors demonstrating their simulation platforms. These tools are becoming increasingly important as the complexity and capabilities of robots continue to grow, allowing developers to refine their creations in a controlled environment before deploying them in the real world. Some of these had fantastic graphical layers that can be used for training computer vision models with synthetic data.<br><br>ICRA 2023 was a robotic wonderland that left me in awe of the innovations and advancements in the field of robotics and automation. From the charming robot dogs to the thought-provoking keynotes and talks, the conference was a testament to the exciting future that lies ahead in this rapidly evolving industry. I can&apos;t wait to see what ICRA 2024 has in store in Japan!</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2023/06/IMG_0545.jpg" class="kg-image" alt="A Robotic Wonderland: ICRA 2023" loading="lazy" width="1201" height="1268" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2023/06/IMG_0545.jpg 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2023/06/IMG_0545.jpg 1000w, https://fergusblog.azurewebsites.net/content/images/2023/06/IMG_0545.jpg 1201w" sizes="(min-width: 720px) 720px"></figure>]]></content:encoded></item><item><title><![CDATA[Voice enabled Chat GPT]]></title><description><![CDATA[<p>OpenAI released it&apos;s model and API that powers the product we see when we use Chat GPT. This means we can start to implement our own solutions around the function by adding it into our custom apps or services. 
One thing I wanted to do right away was</p>]]></description><link>https://fergusblog.azurewebsites.net/voice-enabled-chat-gpt/</link><guid isPermaLink="false">6408a781cf151900013c1f33</guid><category><![CDATA[AI]]></category><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Wed, 08 Mar 2023 15:27:30 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2023/03/Default_a_very_cool_slender_robot_that_is_intelligent_in_a_sci_fi_set_1_ef10e3db-379d-4178-8e1d-68234ff54a48_1.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://fergusblog.azurewebsites.net/content/images/2023/03/Default_a_very_cool_slender_robot_that_is_intelligent_in_a_sci_fi_set_1_ef10e3db-379d-4178-8e1d-68234ff54a48_1.jpg" alt="Voice enabled Chat GPT"><p>OpenAI released its model and API that powers the product we see when we use Chat GPT. This means we can start to implement our own solutions around the function by adding it into our custom apps or services. One thing I wanted to do right away was to get a version working for robots. </p><p>That&apos;s right, a version that will work with voice recognition and voice synthesis. Check it out on GitHub - it&apos;s a basic implementation for now.</p><div class="kg-card kg-button-card kg-align-center"><a href="https://github.com/FergusKidd/Speech-and-Chat-GPT" class="kg-btn kg-btn-accent">GitHub</a></div><p>I use the voice services from Microsoft Azure to take input from the device&apos;s microphone (could be your laptop, could be a robot) and send it off to OpenAI&apos;s GPT 3.5 model for chat. The nice thing about this is that you can include the messaging history to allow for a more natural conversation, as OpenAI can &quot;remember&quot; what you have been talking about. It should work mostly the same as the web product; however, there are obvious limitations, like code generation. You wouldn&apos;t want anyone to dictate code to you, for example...<br><br>Anyway, I hope this will be useful to someone learning about GPT and speech, and I&apos;ll enable it on our robot, Rory, soon, to make &apos;him&apos; even more knowledgeable than before.</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2023/03/IMG_02CE018B7DAA-1.jpeg" class="kg-image" alt="Voice enabled Chat GPT" loading="lazy" width="2000" height="3844" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2023/03/IMG_02CE018B7DAA-1.jpeg 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2023/03/IMG_02CE018B7DAA-1.jpeg 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2023/03/IMG_02CE018B7DAA-1.jpeg 1600w, https://fergusblog.azurewebsites.net/content/images/2023/03/IMG_02CE018B7DAA-1.jpeg 2098w" sizes="(min-width: 720px) 720px"></figure>]]></content:encoded></item><item><title><![CDATA[Generated Textures, but with depth]]></title><description><![CDATA[<p>One very cool open source project has taken Stable Diffusion into Blender. 
This means you can generate AI art, whether it be concept art or a texture, right where you need it.</p><div class="kg-card kg-button-card kg-align-center"><a href="https://github.com/carson-katri/dream-textures" class="kg-btn kg-btn-accent">Their GitHub</a></div><p>One epic feature of this project that was just released, though, is a new model that</p>]]></description><link>https://fergusblog.azurewebsites.net/generated-textures-but-with-depth/</link><guid isPermaLink="false">63c1772dcf151900013c1ef2</guid><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Fri, 13 Jan 2023 15:54:28 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2023/01/Dream-Textures.png" medium="image"/><content:encoded><![CDATA[<img src="https://fergusblog.azurewebsites.net/content/images/2023/01/Dream-Textures.png" alt="Generated Textures, but with depth"><p>One very cool open source project has taken Stable Diffusion into Blender. This means you can generate AI art, whether it be concept art or a texture, right where you need it.</p><div class="kg-card kg-button-card kg-align-center"><a href="https://github.com/carson-katri/dream-textures" class="kg-btn kg-btn-accent">Their GitHub</a></div><p>One epic feature of this project that was just released, though, is a new model that can deal with the depth information it receives from Blender. So not only does it generate something new, it can also understand your model and wrap the texture around it. It seems to do this by asking the model for a new flat image, knowing which surfaces it needs to wrap. It makes for a nice addition, but it can drastically reduce the quality of the image if you aren&apos;t zoomed in on your Blender viewport.</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2023/01/Cereal.png" class="kg-image" alt="Generated Textures, but with depth" loading="lazy" width="1920" height="1080" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2023/01/Cereal.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2023/01/Cereal.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2023/01/Cereal.png 1600w, https://fergusblog.azurewebsites.net/content/images/2023/01/Cereal.png 1920w" sizes="(min-width: 720px) 720px"></figure><p>Here I&apos;ve asked for a simple texture of a cereal box on a simple mesh. The output is obviously not perfect; generative AI has never yet been great at rendering text nicely, but it&apos;s very clearly a cereal box that we have rendered out. </p><p>As a distant background object, this would be more than enough to quickly fill out a scene, or make interesting new models on the fly.
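</p><p>And because everything lives inside Blender, the generated image is easy to pick up in a script too. As a rough sketch (this is not part of the add-on, and it assumes you&apos;ve saved the generated texture next to your .blend file as texture.png), wiring it into a material with the Blender Python API looks something like this:</p><pre><code class="language-python">import bpy

# Load the generated image and build a simple material around it
img = bpy.data.images.load("//texture.png")  # path relative to the .blend file
mat = bpy.data.materials.new(name="GeneratedTexture")
mat.use_nodes = True

nodes = mat.node_tree.nodes
links = mat.node_tree.links
bsdf = nodes["Principled BSDF"]

tex = nodes.new("ShaderNodeTexImage")
tex.image = img
links.new(tex.outputs["Color"], bsdf.inputs["Base Color"])

# Assign the material to the active object (e.g. the cereal box mesh)
bpy.context.active_object.data.materials.append(mat)</code></pre><p>As for where the generation runs, the add-on is flexible. 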
It can use cloud compute if you have a Dream Studio, or locally on your own GPU.</p><figure class="kg-card kg-video-card"><div class="kg-video-container"><video src="https://fergusblog.azurewebsites.net/content/media/2023/01/CerealExample.mp4" poster="https://img.spacergif.org/v1/1920x1080/0a/spacer.png" width="1920" height="1080" loop autoplay muted playsinline preload="metadata" style="background: transparent url(&apos;https://fergusblog.azurewebsites.net/content/images/2023/01/media-thumbnail-ember447.jpg&apos;) 50% 50% / cover no-repeat;"></video><div class="kg-video-overlay"><button class="kg-video-large-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button></div><div class="kg-video-player-container kg-video-hide"><div class="kg-video-player"><button class="kg-video-play-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"/></svg></button><button class="kg-video-pause-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"/><rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"/></svg></button><span class="kg-video-current-time">0:00</span><div class="kg-video-time">/<span class="kg-video-duration"></span></div><input type="range" class="kg-video-seek-slider" max="100" value="0"><button class="kg-video-playback-rate">1&#xD7;</button><button class="kg-video-unmute-icon"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"/></svg></button><button class="kg-video-mute-icon kg-video-hide"><svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24"><path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"/></svg></button><input type="range" class="kg-video-volume-slider" max="100" value="100"></div></div></div></figure><p>Here it is in action, going from a blank cuboid to a fully textured cereal box!</p><p>Blender is using my Nvidia 3080 to render this nice and quickly locally. The nice addition here is that makes it completely free! </p>]]></content:encoded></item><item><title><![CDATA[0 Data Vision Model]]></title><description><![CDATA[<p>Gaming technology and the technology that supports it has come a long way. 
More powerful GPUs in our devices than ever, revolutions in chip manufacturing, and efficiencies in calculations allow us to do incredible things like run ray tracing engines in real time, simulate realistic physics, and so much</p>]]></description><link>https://fergusblog.azurewebsites.net/0-data-vision-model/</link><guid isPermaLink="false">639c46c456bf880001dc451e</guid><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Fri, 16 Dec 2022 11:02:35 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2022/12/Screenshot-2022-12-16-at-10.22.42.png" medium="image"/><content:encoded><![CDATA[<img src="https://fergusblog.azurewebsites.net/content/images/2022/12/Screenshot-2022-12-16-at-10.22.42.png" alt="0 Data Vision Model"><p>Gaming technology and the technology that supports it has come a long way. More powerful GPUs in our devices than ever, revolutions in chip manufacturing, and efficiencies in calculations allow us to do incredible things like run ray tracing engines in real time, simulate realistic physics, and so much more. All of this work has a huge impact on gaming and immersive experiences, but outside of the creative industries, what can we do?</p><p>I&apos;ve done a lot of work and learning around the metaverse. As part of this learning, I&apos;ve made worlds, designed objects, and learnt a lot about 3D rendering. One thing I did to support some of this research was to 3D model my own building for some experimentation around MetaHumans - more on that soon. Once I had a render of my bedroom behind my workstation, I uploaded it as a Teams camera background. After talking with a few people, I realised not everyone cottoned on that it was even a digital render (the real thing is normally much messier).</p><h2 id="this-gave-me-an-idea">This gave me an idea.</h2><p>If people can&apos;t really tell what&apos;s real once the stream has been compressed and sent over the internet, could a machine vision model? I&apos;ve always been interested in synthetic data creation for machine vision, but usually my starting point has been something real, like a photograph. What if we could make a working machine learning model with no real data at all?</p><p>The use case I chose was to find a tennis ball. Why? Well, it&apos;s about the size that our Stretch RE1 robot can grab well, and it won&apos;t matter if it&apos;s dropped or bumps into anything, so I was hoping the resulting model could be used to help the robot find and grab the tennis ball in the room. That&apos;s a challenge for another day though.</p><p>I started off with the 3D model of my room, and settled on the view from behind my workstation.</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2022/12/Ball.png" class="kg-image" alt="0 Data Vision Model" loading="lazy" width="1920" height="1080" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/12/Ball.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/12/Ball.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2022/12/Ball.png 1600w, https://fergusblog.azurewebsites.net/content/images/2022/12/Ball.png 1920w" sizes="(min-width: 720px) 720px"></figure><p>Then I dropped a tennis ball onto the bed (virtually).
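</p><p>The core of the idea is a loop that nudges the ball, shuffles the lights, renders a frame, and records where the ball ended up in the image. Here is a stripped-down sketch of that loop with the Blender Python API - the object and light names are placeholders for whatever your scene uses, and the real script goes on to turn the projected position into a bounding box and upload it:</p><pre><code class="language-python">import random
import bpy
from bpy_extras.object_utils import world_to_camera_view

scene = bpy.context.scene
ball = bpy.data.objects["TennisBall"]   # placeholder object name
light = bpy.data.objects["KeyLight"]    # placeholder light name

for i in range(100):
    # Randomise the ball position and the lighting for every render
    ball.location = (random.uniform(-1.0, 1.0), random.uniform(-1.5, 1.5), 0.9)
    light.data.energy = random.uniform(200, 1500)

    # Project the ball centre into normalised image coordinates for tagging
    u, v, _ = world_to_camera_view(scene, scene.camera, ball.location)
    print(f"render {i}: ball centre at ({u:.3f}, {1 - v:.3f})")

    scene.render.filepath = f"//renders/ball_{i:04d}.png"
    bpy.ops.render.render(write_still=True)</code></pre><p>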
I wrote a script that moves the ball around the room, changes the lighting conditions, and then automatically tags the ball location and uploads it straight to Azure Custom Vision.</p><div class="kg-card kg-button-card kg-align-center"><a href="https://github.com/FergusKidd/Blender-to-Azure-custom-vision" class="kg-btn kg-btn-accent">Visit the GitHub repo</a></div><p>This uses the Cycles engine, with de-noised image generation capped at about 30 seconds on my machine. The beauty of this is that you can easily change or limit the output quality, or scale your compute power to generate many thousands of quality pre-tagged images much faster than you could ever photograph a real scene and manually tag it, assuming you don&apos;t have to build an environment from scratch like I did here.</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2022/12/Screenshot-2022-12-16-at-10.22.42-1.png" class="kg-image" alt="0 Data Vision Model" loading="lazy" width="2000" height="993" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/12/Screenshot-2022-12-16-at-10.22.42-1.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/12/Screenshot-2022-12-16-at-10.22.42-1.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2022/12/Screenshot-2022-12-16-at-10.22.42-1.png 1600w, https://fergusblog.azurewebsites.net/content/images/size/w2400/2022/12/Screenshot-2022-12-16-at-10.22.42-1.png 2400w" sizes="(min-width: 720px) 720px"></figure><p>Playing around with the lighting allows you to simulate lots of different conditions, which, from my past research, really helps to build a more resilient object detection model.</p><p>All that&apos;s left to do is click train and test the results!</p><h2 id="the-real-thing">The real thing</h2><p>I used the minimum number of images required by Azure Custom Vision, just to push it to the max, but in reality, with this method, you can keep adding camera views, lighting setups and noise to your heart&apos;s content.</p><h3 id="first-lets-test-a-real-image-from-a-similar-viewpoint">First, let&apos;s test a real image from a similar viewpoint:</h3><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2022/12/Screenshot-2022-12-16-at-10.48.51.png" class="kg-image" alt="0 Data Vision Model" loading="lazy" width="2000" height="1190" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/12/Screenshot-2022-12-16-at-10.48.51.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/12/Screenshot-2022-12-16-at-10.48.51.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2022/12/Screenshot-2022-12-16-at-10.48.51.png 1600w, https://fergusblog.azurewebsites.net/content/images/size/w2400/2022/12/Screenshot-2022-12-16-at-10.48.51.png 2400w" sizes="(min-width: 720px) 720px"></figure><p>There we have it. 
The model has never seen a real tennis ball, but it can identify the real one with 93.1% confidence, even with the smallest amount of synthetic data we could possibly have generated.</p><h3 id="what-if-we-show-it-an-image-it-wasnt-trained-on-with-more-objects-than-expected">What if we show it an image it wasn&apos;t trained on, with more objects than expected?</h3><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2022/12/Screenshot-2022-12-16-at-10.51.49.png" class="kg-image" alt="0 Data Vision Model" loading="lazy" width="2000" height="1198" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/12/Screenshot-2022-12-16-at-10.51.49.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/12/Screenshot-2022-12-16-at-10.51.49.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2022/12/Screenshot-2022-12-16-at-10.51.49.png 1600w, https://fergusblog.azurewebsites.net/content/images/size/w2400/2022/12/Screenshot-2022-12-16-at-10.51.49.png 2400w" sizes="(min-width: 720px) 720px"></figure><p>Well, we had to reduce the confidence limit, but we have still correctly identified all the tennis balls, even one from the reflection in the mirror.</p><h3 id="next-the-real-thing-from-a-different-viewpoint">Next, the real thing from a different viewpoint:</h3><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2022/12/Screenshot-2022-12-16-at-10.55.35.png" class="kg-image" alt="0 Data Vision Model" loading="lazy" width="2000" height="1201" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/12/Screenshot-2022-12-16-at-10.55.35.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/12/Screenshot-2022-12-16-at-10.55.35.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2022/12/Screenshot-2022-12-16-at-10.55.35.png 1600w, https://fergusblog.azurewebsites.net/content/images/size/w2400/2022/12/Screenshot-2022-12-16-at-10.55.35.png 2400w" sizes="(min-width: 720px) 720px"></figure><p>Not the highest confidences again, but still correct guesses - so this approach is still usable as a first pass where the context of the images needs to change.</p><h3 id="finally-the-same-object-but-in-a-totally-different-context">Finally, the same object but in a totally different context:</h3><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2022/12/Screenshot-2022-12-16-at-11.00.16.png" class="kg-image" alt="0 Data Vision Model" loading="lazy" width="2000" height="1203" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/12/Screenshot-2022-12-16-at-11.00.16.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/12/Screenshot-2022-12-16-at-11.00.16.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2022/12/Screenshot-2022-12-16-at-11.00.16.png 1600w, https://fergusblog.azurewebsites.net/content/images/size/w2400/2022/12/Screenshot-2022-12-16-at-11.00.16.png 2400w" sizes="(min-width: 720px) 720px"></figure><p>So, in a totally different context, it still works!</p><p>Imagine what we could do with many more renders, with no need for any manual tagging, and looking at as many objects as we liked.
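</p><p>For anyone wanting to reproduce these tests, scoring a photo against the trained detector is only a few lines with the Custom Vision prediction SDK. A sketch - the endpoint, key, project ID and published model name are placeholders for your own resource:</p><pre><code class="language-python">from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from msrest.authentication import ApiKeyCredentials

# Placeholders for your own Custom Vision prediction resource and published model
credentials = ApiKeyCredentials(in_headers={"Prediction-key": "YOUR_PREDICTION_KEY"})
predictor = CustomVisionPredictionClient("https://YOUR_RESOURCE.cognitiveservices.azure.com/", credentials)

with open("real_tennis_ball_photo.jpg", "rb") as image_data:
    results = predictor.detect_image("YOUR_PROJECT_ID", "YOUR_PUBLISHED_MODEL_NAME", image_data)

# Print every detection with its confidence and top-left corner
for prediction in results.predictions:
    box = prediction.bounding_box
    print(f"{prediction.tag_name}: {prediction.probability:.1%} at ({box.left:.2f}, {box.top:.2f})")</code></pre><p>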
There are great things going on in this sector from the likes of Unity as well, but for me, Blender gives the greatest flexibility for 3D work, with beautiful rendering available completely for free.</p>]]></content:encoded></item><item><title><![CDATA[AI Generated Textures - now in full]]></title><description><![CDATA[<p>We discussed <a href="https://ferguskidd.com/dall-e-2-2/">AI generated textures using DALL-E-2</a> before. However, there have been two significant updates.</p><ol><li>The <a href="https://openai.com/blog/dall-e-api-now-available-in-public-beta/">Dall-E-2 API</a> is finally here!</li><li>I figured out how to easily get from a flat texture to a full 3D one using <a href="https://boundingboxsoftware.com/materialize/">Materialize</a>.</li></ol><p>The Dall-E-2 API makes it infinitely easier to modify images in</p>]]></description><link>https://fergusblog.azurewebsites.net/ai-generated-textures-now-in-full/</link><guid isPermaLink="false">63653904c9b3c50001bc675f</guid><dc:creator><![CDATA[Fergus Kidd]]></dc:creator><pubDate>Fri, 04 Nov 2022 16:48:18 GMT</pubDate><media:content url="https://fergusblog.azurewebsites.net/content/images/2022/11/Yellow-Brick-Cube-1.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://fergusblog.azurewebsites.net/content/images/2022/11/Yellow-Brick-Cube-1.jpg" alt="AI Generated Textures - now in full"><p>We discussed <a href="https://ferguskidd.com/dall-e-2-2/">AI generated textures using DALL-E-2</a> before. However, there have been two significant updates.</p><ol><li>The <a href="https://openai.com/blog/dall-e-api-now-available-in-public-beta/">Dall-E-2 API</a> is finally here!</li><li>I figured out how to easily get from a flat texture to a full 3D one using <a href="https://boundingboxsoftware.com/materialize/">Materialize</a>.</li></ol><p>The Dall-E-2 API makes it infinitely easier to modify images in the way we need to. Instead of manually doing all the image generation, we can thread it into a single Python operation. </p><div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-emoji">&#x1F4A1;</div><div class="kg-callout-text">Check out the <a href="https://github.com/FergusKidd/Seamless-Texture-Generation-with-DALL-E-2">GitHub repo</a> and try it yourself</div></div><p>I&apos;ll show you step by step what we generate. First, let&apos;s ask the API for a texture we might want to see. In this case I&apos;m asking for a yellow brick texture. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2022/11/ddf54043-fe11-4336-83ea-dbe7943b27f3.png" class="kg-image" alt="AI Generated Textures - now in full" loading="lazy" width="389" height="389"><figcaption>DALL-E-2 generated texture of a yellow brick wall</figcaption></figure><p>Pretty good start. The problem is that if we tile this over a large 3D surface, it won&apos;t look good because the edges won&apos;t match up. It &apos;tiles&apos; horribly when we replicate the image many times - and a raw generation like this always will.
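</p><p>For reference, that first generation step is a single call to the Images API. Here&apos;s a rough sketch with the openai Python package of the time - the API key and prompt wording are placeholders, and the repo linked above strings the whole process together:</p><pre><code class="language-python">import openai
import requests

openai.api_key = "YOUR_OPENAI_API_KEY"  # placeholder

# Ask DALL-E-2 for a candidate texture
response = openai.Image.create(
    prompt="a flat texture of a yellow brick wall, photographed straight on, even lighting",
    n=1,
    size="1024x1024",
)

# Download the generated image so we can work on the seams next
image_url = response["data"][0]["url"]
with open("yellow_brick.png", "wb") as f:
    f.write(requests.get(image_url).content)</code></pre><p>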
Let&apos;s move the &apos;seams&apos; to the middle so we can see it better.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2022/11/0b83c4c4-1504-4638-acab-53380fb3d7a2.png" class="kg-image" alt="AI Generated Textures - now in full" loading="lazy" width="389" height="389"><figcaption>Seams reorganised to the middle</figcaption></figure><p>All we&apos;ve done here is move the outside in, by chopping the image into 4 quarters and swapping their positions. Now we can see the seam in the middle as if we were looking at 4 tiled images. It&apos;s not the worst seam ever; DALL-E-2 has done a very good job, but we can easily tell it&apos;s not right.</p><p>The next step is to create an alpha mask on this image, as we know where the seam is: a nice cross shape in the centre.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2022/11/7bcff35f-8758-4c15-90a7-bf6197faa669.png" class="kg-image" alt="AI Generated Textures - now in full" loading="lazy" width="389" height="389"><figcaption>Alpha mask for the seams</figcaption></figure><p>This is a representation of the mask, with areas to keep showing through (in blue) and areas to re-do (in purple). So what can we do with this? Well, the great thing about the DALL-E-2 API is that we can also edit images. Sending off the image with the seams in the middle, along with this mask and the original prompt, will allow us to regenerate the seam area for, hopefully, a lovely smooth transition.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2022/11/Output_texture.png" class="kg-image" alt="AI Generated Textures - now in full" loading="lazy" width="1024" height="1024" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/11/Output_texture.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/11/Output_texture.png 1000w, https://fergusblog.azurewebsites.net/content/images/2022/11/Output_texture.png 1024w" sizes="(min-width: 720px) 720px"><figcaption>The second output from DALL-E-2, a now seamless texture</figcaption></figure><p>And there we have it. You can clearly see some new artefacts, like the blue smudge on the brick in the middle where the mask was, but DALL-E-2 has regenerated the gaps we marked out, so now the image is pretty much seamless and tile-able. Great!</p><h2 id="lets-make-it-3d">Let&apos;s make it 3D</h2><p>For this to really be usable it doesn&apos;t just need to be seamless; it needs depth, roughness and metallic layers - all sorts of additional information about how light would interact with this texture in the real world, so we can better simulate it in a render engine. I previously thought this would be a manual and labour-intensive process until I found <a href="https://boundingboxsoftware.com/materialize/">Materialize</a> from Bounding Box Software. 
This FREE tool takes 2D flat textures from a photo, or in our case AI, and makes them pop by generating the other information we need.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2022/11/Screenshot-2022-11-04-at-16.31.18.png" class="kg-image" alt="AI Generated Textures - now in full" loading="lazy" width="2000" height="414" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/11/Screenshot-2022-11-04-at-16.31.18.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/11/Screenshot-2022-11-04-at-16.31.18.png 1000w, https://fergusblog.azurewebsites.net/content/images/size/w1600/2022/11/Screenshot-2022-11-04-at-16.31.18.png 1600w, https://fergusblog.azurewebsites.net/content/images/2022/11/Screenshot-2022-11-04-at-16.31.18.png 2202w" sizes="(min-width: 720px) 720px"><figcaption>Some of the layers made using Materialize</figcaption></figure><p>This is a huge improvement on our flat tile-able image. Let&apos;s have a look at what that looks like rendered.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2022/11/Yellow-Brick-Cube.jpg" class="kg-image" alt="AI Generated Textures - now in full" loading="lazy" width="1445" height="1273" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/11/Yellow-Brick-Cube.jpg 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/11/Yellow-Brick-Cube.jpg 1000w, https://fergusblog.azurewebsites.net/content/images/2022/11/Yellow-Brick-Cube.jpg 1445w" sizes="(min-width: 720px) 720px"><figcaption>The now 3D texture rendered with lighting and depth</figcaption></figure><p>And there we go! A now usable seamless 3D texture, completely AI generated. 
This opens up a world of opportunities for amazing new on-demand texture rendering!</p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x1F4A1;</div><div class="kg-callout-text">The other nice thing about using the API is that the images don&apos;t seem to include the usual watermarking in the corner</div></div><p>Here are some more example outputs, just for fun:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2022/11/3eea4cd5-2316-45b1-acf4-4686ca56dbed.png" class="kg-image" alt="AI Generated Textures - now in full" loading="lazy" width="1570" height="498" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/11/3eea4cd5-2316-45b1-acf4-4686ca56dbed.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/11/3eea4cd5-2316-45b1-acf4-4686ca56dbed.png 1000w, https://fergusblog.azurewebsites.net/content/images/2022/11/3eea4cd5-2316-45b1-acf4-4686ca56dbed.png 1570w" sizes="(min-width: 720px) 720px"><figcaption>White fur</figcaption></figure><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2022/11/70007d2d-8e5b-4703-930d-32b6c7891f3c.png" class="kg-image" alt="AI Generated Textures - now in full" loading="lazy" width="1570" height="498" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/11/70007d2d-8e5b-4703-930d-32b6c7891f3c.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/11/70007d2d-8e5b-4703-930d-32b6c7891f3c.png 1000w, https://fergusblog.azurewebsites.net/content/images/2022/11/70007d2d-8e5b-4703-930d-32b6c7891f3c.png 1570w" sizes="(min-width: 720px) 720px"><figcaption>An alien rocky planet surface</figcaption></figure><p>Notice we may want to do some colour stabilisation across the segments on these in the future.</p><figure class="kg-card kg-image-card"><img src="https://fergusblog.azurewebsites.net/content/images/2022/11/9238e4bf-e371-4136-8dd9-ced98cd7f9a0.png" class="kg-image" alt="AI Generated Textures - now in full" loading="lazy" width="1570" height="498" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/11/9238e4bf-e371-4136-8dd9-ced98cd7f9a0.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/11/9238e4bf-e371-4136-8dd9-ced98cd7f9a0.png 1000w, https://fergusblog.azurewebsites.net/content/images/2022/11/9238e4bf-e371-4136-8dd9-ced98cd7f9a0.png 1570w" sizes="(min-width: 720px) 720px"></figure><p>A good example of how things with regular patterns or directionality can go very wrong very quickly...</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fergusblog.azurewebsites.net/content/images/2022/11/97c61a59-95c0-4c50-9c05-391c7879ccd5.png" class="kg-image" alt="AI Generated Textures - now in full" loading="lazy" width="1570" height="498" srcset="https://fergusblog.azurewebsites.net/content/images/size/w600/2022/11/97c61a59-95c0-4c50-9c05-391c7879ccd5.png 600w, https://fergusblog.azurewebsites.net/content/images/size/w1000/2022/11/97c61a59-95c0-4c50-9c05-391c7879ccd5.png 1000w, https://fergusblog.azurewebsites.net/content/images/2022/11/97c61a59-95c0-4c50-9c05-391c7879ccd5.png 1570w" sizes="(min-width: 720px) 720px"><figcaption>Grass</figcaption></figure>]]></content:encoded></item></channel></rss>