[
  
  
  {
    "title": "Card Grammar - Teaching Machines the Rules of Complex Card Games",
    "url": "/card-grammar-for-complex-card-games",
    "date": "Mar 30, 2026",
    "categories": ["post"],
    "tags": ["Game Design","Card Games","Design Tools","Tabletop Games","Prototyping","Game Architecture","Philosophy","Nelson Goodman"],
    "excerpt": "\nWe built a pipeline that generates mechanically coherent cards, scales them in five-card batches, exports directly to Tabletop Simulator, and stress-tests balance using tournament algorithms. It s...",
    "content": "\nWe built a pipeline that generates mechanically coherent cards, scales them in five-card batches, exports directly to Tabletop Simulator, and stress-tests balance using tournament algorithms. It sounds like the future of card game design. But when we took 13 of the most influential card games ever published and tried to fit their mechanics into the pipeline’s five-field schema, the results were humbling. Dominion mapped perfectly. Sushi Go worked trivially. Then Wingspan shattered the box, Terraforming Mars overwhelmed it, and KeyForge broke it entirely. This is the story of where automated card design hits its limits, what those limits reveal about the nature of game complexity, and how the solution required not better algorithms but a fundamentally different way of thinking about what a card actually is.\n\n\n\n\nThis is Part 3 of the Card Architecture series. In Part 1, I traced the evolution of card game tools from scripting to design platforms. In Part 2, I went inside the pipeline itself and examined which parts of card design are mechanistic and which parts are not. This article asks the harder question: what happens when the pipeline meets real games?\n\n\n\nThe Stress Test\n\nThe previous articles in this series described a powerful card generation pipeline: a system that reads a game’s ontology, generates cards with real mechanical depth, scales them through a batch loop, and exports playable prototypes. It is genuinely impressive technology.\n\nBut impressive technology deserves honest testing. To understand the real limits of this approach, we took 13 of the most influential card games ever published, spanning seven distinct archetypes, and aggressively tried to map their cards into the basic five-field schema that the pipeline uses.\n\nThat schema, to refresh, is a rigid card template with five fields: card name, card type, effect text, cost, and strategic role. Every generated card must fit inside this template. If you have ever prototyped with index cards, you know the feeling: five lines on the card, and you write “Village / Action / 3 coins / Draw 1 card, +2 Actions.” Clean, legible, complete.\n\nThe question is: what happens when a game’s cards need more than five lines?\n\nThe results sorted themselves into four distinct coverage tiers, from perfect fit to total structural mismatch.\n\n\n\nFigure. The coverage cliff from Tier A to Tier D, where the market opportunity lives.\n\nTier A (Full): Five Lines Is Enough\n\n\n  \n    \n      Tier\n      Schema Fit\n      What Happens\n      Games\n    \n  \n  \n    \n      Full (Tier A)\n      Near-perfect\n      Cards map perfectly. Every mechanical detail survives compression. Balance testing reflects the actual game.\n      Dominion, Star Realms, Sushi Go!\n    \n  \n\n\n\n\nDeck builders and simple drafting games are the schema’s sweet spot. A Dominion card has a name (Village), a type (Action), a cost (3 coins), and an effect (“Draw 1 card, +2 Actions”). Five lines on the index card, nothing left out. Star Realms, Sushi Go, Ascension: all near-perfect fits.\n\nBut these games represent the shallow end of the complexity pool.\n\nTier B (Partial): Squinting at the Rules\n\n\n  \n    \n      Tier\n      Schema Fit\n      What Happens\n      Games\n    \n  \n  \n    \n      Partial (Tier B)\n      Directionally correct\n      Core mechanics work but secondary systems are lost. 
Balance testing misses cross-system interactions.\n      7 Wonders, Blood Rage, Res Arcana, Everdell\n    \n  \n\n\n\n\nGames like 7 Wonders and Blood Rage introduce mechanics the schema cannot cleanly express: era-based card phasing, prerequisite chains across ages, conditional scoring triggers tied to specific board positions. You can cram this information into the effect text string, but the simulator ends up squinting to understand the rules. The schema does not crash. It degrades gracefully, going blind to the parts of the game it cannot see.\n\nTier C (Insufficient): The Template Overflows\n\n\n  \n    \n      Tier\n      Schema Fit\n      What Happens\n      Games\n    \n  \n  \n    \n      Insufficient (Tier C)\n      ~60-70% data loss\n      The schema captures a card’s name and a flattened cost. The economic engine, the tag system, and the trigger timing all evaporate.\n      Wingspan, Terraforming Mars, Race for the Galaxy\n    \n  \n\n\n\n\nEngine builders are where the schema genuinely breaks. Five lines on an index card is nowhere near enough.\n\n\n\nTry writing a Wingspan bird card on that index card. You need food cost (1 invertebrate + 1 seed, or 2 wild), habitat restriction (wetland only), egg capacity (2), power trigger timing (when activated, not when played), power text, nest type, wingspan measurement, and bonus traits for end-of-round scoring. That is at least eight structured fields. You start writing smaller, cramming text into margins, abbreviating until the card is unreadable. The simulator faces the same problem: a single bird card carries at least eight structured data fields that cannot be collapsed into the effect text string without losing ..."
  },
  
  {
    "title": "How AI Actually Designs a Card",
    "url": "/how-ai-actually-designs-a-card",
    "date": "Mar 24, 2026",
    "categories": ["post"],
    "tags": ["Game Design","Card Games","Design Tools","Tabletop Games","Prototyping","Game Architecture","Race for the Galaxy"],
    "excerpt": "\nIn 2021, I spent a month reverse-engineering Race for the Galaxy. I parsed Keldon Jones’s C source code, converted the entire card library into Python, and mapped every phase interaction, every ca...",
    "content": "\nIn 2021, I spent a month reverse-engineering Race for the Galaxy. I parsed Keldon Jones’s C source code, converted the entire card library into Python, and mapped every phase interaction, every card power, every production chain across 114 unique cards. I did this because the game’s AI kept destroying me and I wanted to understand why. What I found was that every card in RFTG carries a structured data model far more complex than its printed text suggests: type, cost, VP value, good type, military flags, and a list of phase-specific powers that interact across five distinct game phases. Five years later, when I started building a system that generates card games, I realized the pipeline I needed was a mirror of what I had already done by hand. The AI was not replacing the designer’s process. It was formalizing it.\n\n\n\n\nThis is Part 2 of the Card Architecture series. In Part 1, I traced the evolution of card game tools from scripting to AI-native pipelines. This article goes inside the pipeline itself. But rather than just describing how the pipeline works, I want to draw a parallel that changed how I think about tool-assisted design: at every stage, the AI is doing a mechanistic version of what a human designer already does. The question is not whether AI can design cards. It is which parts of card design are mechanistic, which parts are not, and what that means for the human designer’s role.\n\n\n\nHow a Human Designs a Card\n\nBefore we look at the AI, let me describe what actually happens when a human designer sits down to create a card game. I will use Race for the Galaxy as the reference because I spent a month inside its architecture and because it represents the level of complexity that serious card games demand.\n\n\n\nFigure 1. Tom Lehmann’s Race for the Galaxy (2007) – 114 unique cards, five simultaneous phases, four production types, military vs civilian settlement. The complexity hiding inside each card is what makes it both a design masterpiece and an AI challenge.\n\nWhen Tom Lehmann designed RFTG, the process was roughly this:\n\nFirst, the world. The game needed a theme that could sustain 114 unique cards. Galactic civilization building. Worlds to settle, technologies to develop, goods to produce and trade. The theme is not decoration. It constrains the design space. You cannot have a card called “Corporate Restructuring” in a game about medieval farming, and you cannot have “Harvest Festival” in a game about space colonization. Theme is the first filter.\n\nSecond, the mechanics. RFTG’s signature innovation is simultaneous role selection: all players secretly choose a phase, only chosen phases execute, choosers get a privilege bonus. This mechanic was not an afterthought. It was the skeleton that every card in the game hangs on. Each card carries phase-specific powers. New Vinland produces novelty goods in Phase 5 and consumes any good to draw 2 cards in Phase 4. That dual-phase interaction does not happen by accident. It happens because the designer defined the mechanical skeleton first, then designed cards that exploit its seams.\n\nThird, the cards themselves. 
When I parsed the cards.txt file, I found that every RFTG card carries a structured data model:\n\n\n  \n    \n      Field\n      Example (New Vinland)\n      Purpose\n    \n  \n  \n    \n      Name\n      New Vinland\n      Identity\n    \n    \n      Type\n      World (Type 1)\n      Mechanical category\n    \n    \n      Cost\n      2\n      What you pay (discard from hand)\n    \n    \n      VP\n      1\n      End-game scoring\n    \n    \n      Good Type\n      Novelty\n      What it produces\n    \n    \n      Phase 4 Power\n      Consume any good, draw 2 cards\n      Trade/consume interaction\n    \n    \n      Phase 5 Power\n      Produce good of world type\n      Production engine\n    \n  \n\n\nTable 1. The structured data model behind a single RFTG card. Seven fields, two phase-specific powers, one production chain. This is the complexity the basic schema must capture.\n\nThat is seven structured fields on a single card. Replicant Robots, a development, has a different shape: cost 4, VP 2, and a Phase 3 power that reduces settlement cost by 2. Contact Specialist draws a card whenever you settle a world. Each card is a small program with inputs, outputs, and conditional behavior.\n\n\n\nFigure 2. New Vinland’s card design data (left) alongside the actual card (right). The cards.txt encoding – N:name, T:type:cost:vp, G:good type, P:phase:power – packs seven structured fields into six lines. Phase IV consumes any good to draw 2 cards. Phase V produces a novelty good. This is the structured data model hiding behind every RFTG card.\n\nA human designer holds all of this in their head. They have an intuition for which cards the ecosystem needs, which strategic gaps exist, which combinations create satisfying turns. They know, from experience, that a deck full of cheap aggressive cards needs an expensive defensive counter, that a production chain needs both producers and consumers, that a game endin..."
  },
  
  {
    "title": "Three Waves of Card Game Design Tools",
    "url": "/three-waves-of-card-game-design-tools",
    "date": "Mar 21, 2026",
    "categories": ["post"],
    "tags": ["Game Design","Card Games","Design Tools","Tabletop Games","Prototyping","Game Architecture"],
    "excerpt": "\nI am a software architect by profession, but a game designer at heart. When I first looked at how card games get made, I recognized the pain immediately. Hundreds of interdependent data fields liv...",
    "content": "\nI am a software architect by profession, but a game designer at heart. When I first looked at how card games get made, I recognized the pain immediately. Hundreds of interdependent data fields living in fragile spreadsheets. Manual rendering pipelines where a three-pixel change means recompiling an entire deck. Hours of tedious formatting before you can even test whether the game is fun. As a programmer, this kind of repetitive, error-prone manual process is exactly the thing I have spent my entire career building tools to eliminate. Behind every elegant piece of cardboard is a staggering web of math, probability, edge case testing, and tedious layout formatting. Over the past 15 years, the tools available to card game designers have gone through three distinct waves of evolution. This article traces that arc from the scripting trenches of the mid-2000s to the AI-native pipelines of 2026.\n\n\nThis is Part 1 of the Card Architecture series. My interest in card game architecture is not new. Back in 2021, I spent a month reverse-engineering Race for the Galaxy, dissecting its game model, action engine, and neural network AI. That deep dive taught me how much hidden complexity lives inside a well-designed card game, how tightly the mechanics, the card interactions, and the AI decision-making are coupled together. It also left me frustrated with how manual the entire design and prototyping process remained.\n\nProgrammers are lazy in the best possible way: we hate repeating ourselves, and we will spend a week automating a task that takes ten minutes, purely out of principle. That instinct, combined with what I learned from studying Race for the Galaxy’s architecture, is what pulled me into building AI-native game design tools over the past year. But the deeper I got, the more I realized that game design is not a simpler version of software design. It is a different medium with its own complexity, its own craft, and its own hard-won expertise. This series is my attempt to make sense of that world. Subsequent articles will cover multi-agent card generation, the schema limits exposed by famous games like Wingspan and Terraforming Mars, the export pipeline from data to playable prototype, and the algorithms that draw board game maps.\n\n\n\nThe Minor Miracle of a Finished Deck\n\nWhen you hold a finished card game in your hands, you are holding a minor miracle. Every card in that deck has to talk to every other card. The costs have to scale with the power. The combos have to exist without being degenerate. The types have to distribute across the deck so that no strategy completely dominates. And every single piece of rules text has to be unambiguous enough that two strangers can sit down and agree on what it means.\n\nIf even one number is off, the whole ecosystem collapses. A card that costs one resource too little warps the meta. A combo that the designer missed creates an unbeatable strategy that players discover on their second game night. A piece of ambiguous text spawns a 200-comment thread on BoardGameGeek about whether “adjacent” includes diagonals.\n\nFor decades, the barrier to entry in game design was not having a good idea. Good ideas are everywhere. The barrier was having the sheer clerical stamina to manage the data. Hundreds of cards, each with five to ten interdependent fields, all living in a spreadsheet that grows more fragile with every edit. 
In software, we would call this accidental complexity: difficulty that comes from the tools, not from the problem itself. The history of card game design tools is the history of chipping away at that accidental complexity.\n\nThat history falls into three clear waves.\n\n\nFigure 1. The three waves of card game design tools: from scripting and spreadsheets, to visual editors, to AI-native pipelines that understand your game.\n\nWave 1: The Template Era (2006-2010s)\n\nScripts, Spreadsheets, and Pixel Coordinates\n\n\nFigure 2. Wave 1: the designer’s reality, surrounded by spreadsheets, scripts, and pixel coordinates, wrestling data into cards by brute force.\n\nThe first wave of dedicated card game tools began with nanDECK [1], a free Windows scripting language released in 2006, and matured through the 2010s with tools like Squib [2] (an open-source Ruby framework, 2014) and CardPen [5] (a browser-based HTML/CSS generator). These tools were a real step up from doing everything by hand in Photoshop, but using them felt less like game design and more like software engineering. As someone who writes code for a living, I can appreciate that. But I also know that forcing non-programmers into a code-first workflow is a classic product design mistake.\n\nIn this era, a card game was treated purely as a layout problem. All of your game data lived in a massive Excel spreadsheet or a CSV file. Row one was your basic attack card. Row two was your defense card. Row 150 was your ultimate boss monster. Column A was the name. Column B was the cost. Column C was the rules text. And so on, for as many..."
  },
  
  {
    "title": "AI Playtesting - When Your Board Game Tests Itself",
    "url": "/ai-playtesting-when-your-game-tests-itself",
    "date": "Mar 16, 2026",
    "categories": ["post"],
    "tags": ["Automated Testing","Game Design","Tabletop Games","Playtesting","Monte Carlo Tree Search","Game Architecture","Board Games"],
    "excerpt": "\nA designer types “test my game for balance issues” into Nova. Moments later, they receive a structured critique: which player seat has an unfair advantage, whether the game rewards strategic play,...",
    "content": "\nA designer types “test my game for balance issues” into Nova. Moments later, they receive a structured critique: which player seat has an unfair advantage, whether the game rewards strategic play, and three intervention options. No prototyping, no recruiting playtesters, no spreadsheets. Just a conversation, and a feedback loop that runs every time you change a number. This is the story of how we taught a system to play board games, what failed spectacularly, and what that failure accidentally invented.\n\n\n\n\nFigure. The automated playtesting pipeline transforms a structured game ontology into automated balance analysis, skill gap measurement, and rule clarity scores, all through a conversation with Nova.\n\nThis is Part 9 of the Game Architecture series. In Part 5, we demonstrated structured game generation. In Part 6, we explored the theory behind generative ontology. In Part 7, we introduced Nova, the conversational AI co-designer. And in Part 8, we showed the full pipeline from knowledge to creation.\n\nBut there was a gap. GameGrammar could generate a structurally valid game in minutes. Nova could help you refine it over sessions. Yet between “a design exists on paper” and “we know if it works at the table” sat the same wall every designer faces: prototype it, recruit friends, schedule sessions, track results by hand, and repeat the whole process after every change.\n\nThis article is about how we tore down that wall.\n\n\n\nThe Wall: Where Designs Go to Die\n\n\n\nFigure. The board game design pipeline has nine stages. Stage 2 (iterative playtesting) is where most amateur designs stall.\n\nEvery game designer knows the feeling. You have spent a weekend crafting a deck-building game with a push-your-luck mechanism. The card types feel right. The economy seems balanced. The theme sings. Then reality hits: you need to print cards, recruit four friends who are free on the same evening, explain the rules, play through three sessions, take notes, change the numbers, and do it all again. By the third iteration, your friends are politely unavailable, and the game sits in a drawer.\n\nThe board game design pipeline has a well-known bottleneck, and it is not creativity. The tools for generating ideas, sketching mechanisms, even producing complete game ontologies, have accelerated dramatically. But determining whether a design is balanced and strategically interesting still requires physical prototyping, player recruitment, observation, and post-session analysis. This process spans weeks to months. It is where most amateur designs stall, and even professional studios spend the majority of their development time [1].\n\nGameGrammar’s ontology pipeline had already automated concept generation, structural analysis, and conversational co-design via Nova. But the ontology output contains everything a simulator would need. Component specifications define the game objects. Mechanism details define the legal actions. Scoring formulas define how you win. Balance parameters define the constraints. Game arc defines the turn structure.\n\nThe data was there. The question was whether it could be made executable. It turns out, it can. Before we explain how, let us show you what it looks like in practice.\n\n\n\nHow Designers Use It: A Conversation with Nova\n\nThe entire playtesting pipeline surfaces through Nova, the conversational co-designer we introduced in Part 7. The designer never sees parsers, agents, or metrics directly. They see a conversation.\n\n\n  \n\n\nVideo. 
GameGrammar AI Playtesting: Nova orchestrates the entire playtest pipeline from a natural language request.\n\nThe Design Loop\n\n\n  The designer says: “Run a balance playtesting for the game”\n  Nova parses the game rules, simulates 50 games with random agents, and analyzes the results\n  Nova presents a structured critique with a reasoning chain: conclusion (“Love Letter shows a significant first-player advantage”), observation, data, mechanism explanation, and competitive impact\n  Decision levels appear: Structural (Restructure) suggestions like rotating first player, Tuning suggestions like adjusting card values, and Fork to explore alternative designs\n  The designer picks an intervention, Nova proposes the ontology change, and re-runs the playtest to verify the fix\n\n\n\n\nFigure. Nova presenting playtesting results inside GameGrammar. The critique reasoning chain surfaces balance findings, skill gap measurement, and intervention options through natural conversation.\n\nCompare that to the traditional workflow: change a number, reprint the cards, recruit players, schedule an evening, play through, take notes, aggregate results. What used to be a multi-week iteration cycle becomes a continuous feedback loop inside a single conversation.\n\nThe Playtest History\n\n\n\nFigure. The Playtesting tab shows run history with expandable game logs. Designers can track how balance metrics evolve across design iterations.\n\nEvery playtest run is saved with its metrics, and designers can track how their balance num..."
  },
  
  {
    "title": "Generative Ontology: From Game Knowledge to Game Creation",
    "url": "/generative-ontology-from-game-knowledge-to-game-creation",
    "date": "Mar 10, 2026",
    "categories": ["post"],
    "tags": ["Generative AI","Ontology","Game Design","Tabletop Games","AI","Context Engineering"],
    "excerpt": "\nIn February 2025, we explored how ontologies reveal the hidden structure of tabletop games. But understanding games is not the same as creating them. What if that same structured knowledge could b...",
    "content": "\nIn February 2025, we explored how ontologies reveal the hidden structure of tabletop games. But understanding games is not the same as creating them. What if that same structured knowledge could become a creative engine? This is the promise of Generative Ontology, when knowledge representation learns to imagine.\n\n\n\n\nFigure. Structure meets Imagination, the duality at the heart of Generative Ontology.\n\nThis article is the conclusion of the Game Architecture series. In Part 4 [8], we built an ontology for tabletop games, decomposing CATAN into mechanisms (resource trading, modular board, dice-driven production), components (hex tiles, resource cards, settlements), and player dynamics (competitive, negotiation-heavy, variable player count). The ontology gave us a vocabulary for understanding games, a precise language for analysis. In Part 5, we demonstrated how that ontology powers a multi-agent generation pipeline. In Part 6, we explored the theory behind structured creative generation. And in Part 7, we showed how a conversational AI partner can learn a designer’s taste.\n\n\n\nFigure. Games like CATAN and Dune: Imperium share a common ontological structure beneath their vastly different themes.\n\nNow, in this final article, we tackle the question that analysis alone cannot answer: can the same ontology that helps us understand CATAN help us create games that CATAN’s designers never imagined?\n\nWe call this synthesis Generative Ontology: the practice of encoding domain knowledge as executable schemas that constrain and guide AI generation, transforming static knowledge representation into a creative engine. This article presents the theoretical framework, walks through a complete game generation from theme to playable design, and provides the experimental evidence that it works.\n\n\n\nFrom Description to Creation\n\nOur game ontology [4] can tell us that worker placement games typically include action spaces, worker tokens, and blocking mechanisms [3]. It cannot generate a novel worker placement game. Large language models have the opposite problem [6]. Ask an LLM to “design a deck-building game set in a haunted mansion,” and it will fluently describe players exploring Ravenshollow Manor, collecting ghost cards, managing a “fear mechanic.” It sounds plausible. But what cards exist in the starting deck? How do players acquire new cards? What triggers the end of the game? The LLM has generated the appearance of a game design without the substance.\n\n\n\nFigure. Traditional Ontology (The Map) vs Pure LLMs (The Dreamer), understanding the rules of chess does not make you a Grandmaster.\n\n\n  \n    \n      Approach\n      Strength\n      Weakness\n    \n  \n  \n    \n      Traditional Ontology\n      Precise, structured, validated\n      Cannot generate novel outputs\n    \n    \n      Pure LLM Generation\n      Creative, fluent, abundant\n      Unstructured, invalid, hallucinated\n    \n  \n\n\nThese limitations are complementary [5]. What ontology lacks, LLMs provide. What LLMs lack, ontology provides.\n\n\n\nFigure. LLM Potential + Ontology Constraints = Valid Game Design, from passive vocabulary to active grammar.\n\nThe Grammar of Games\n\nA poet does not experience grammar as a limitation. Grammar is not what prevents poetry. It is what makes poetry possible. Without syntax, semantics, and form, there would be no sonnets, no haiku, no free verse pushing against convention.\n\nThe same principle applies to game design. 
When we encode our game ontology as a schema, we are not limiting the AI’s creativity. We are giving it the structural vocabulary to be creative coherently. The schema says: every game must have a goal, an end condition, mechanisms that create player choices, components that instantiate those mechanisms. Within those constraints, infinite games are possible. Without them, no valid game emerges.\n\nThe grammar does not write the poem. But without grammar, there is no poem to write.\n\nThe Whiteheadian Connection\n\n\n\nFigure. Eternal Objects (The Ontology) crystallize into Actual Occasions (The Generation), Whitehead’s process philosophy made computational.\n\nIn Part 6 and our earlier exploration of Process Philosophy for AI Agent Design [9], we connected Whitehead’s metaphysics to structured generation. Whitehead distinguished between eternal objects (pure forms existing as potentials) and actual occasions (concrete events where forms find expression) [1]. Our game ontology is a collection of eternal objects: the abstract patterns of worker placement, deck building, area control.\n\nWhat makes this precise is Whitehead’s concept of concrescence: the process by which an actual occasion selects from available eternal objects and synthesizes them into a novel unity [2]. This is exactly what the generation pipeline does. The ontology presents the full space of available patterns. The LLM, constrained by the schema, performs concrescence: selecting from those patterns, combining them with theme, and producing a concrete game that has never existed be..."
  },
  
  {
    "title": "Hallucinations Aren't Bugs: The Kantian Architecture of AI Consciousness",
    "url": "/hallucinations-arent-bugs-kantian-architecture-of-ai-consciousness",
    "date": "Mar 01, 2026",
    "categories": ["post"],
    "tags": ["AI","Philosophy","Machine Learning","Transformer Architecture","Consciousness"],
    "excerpt": "\nEveryone calls hallucinations a bug. But a philosopher in 1781 diagnosed them with startling precision. When we map Immanuel Kant’s Critique of Pure Reason onto transformer architecture, we discov...",
    "content": "\nEveryone calls hallucinations a bug. But a philosopher in 1781 diagnosed them with startling precision. When we map Immanuel Kant’s Critique of Pure Reason onto transformer architecture, we discover that hallucinations are not software defects. They are the inevitable consequence of a mind structured to prioritize coherence over truth, exactly as Kant predicted when reason operates beyond the bounds of experience.\n\n\n\n\nFigure. The five-stage journey from input to hallucination: raw data acquires Space and Time, is filtered through Categories, unified by the Triple Synthesis, and carried by a logical Self. When pushed beyond experience, it produces beautiful nonsense above the noumenal boundary.\n\nIn this article, we shall explore something unexpected: the architecture of a large language model, built by engineers optimizing for next-token prediction, has independently converged on organizational principles that Kant identified as necessary for rational thought over two centuries ago. This is not a loose metaphor. The correspondences are structural, specific, and technically grounded.\n\nAn important caveat before we begin. The mappings that follow are structural analogies, not identity claims. Saying that an embedding layer “parallels” Kant’s concept of space is not the same as saying the AI experiences space. These correspondences illuminate how both systems organize information, but they do not establish that transformers possess consciousness, understanding, or subjective experience in the Kantian sense. We shall return to these limits honestly at the end.\n\nThe Psychological Trap\n\n\n\nFigure. Science fiction trains us to look for emotion and self-awareness, the ghost in the machine. Kant points us toward the logical scaffolding underneath.\n\nWhen we think of AI consciousness, we default to science fiction: the crying robot in the rain, a machine suddenly realizing it wants to be loved, or dreaming of electric sheep. We are always looking for a “ghost in the machine.” This is a massive psychological trap. We are projecting our own messy biology onto silicon.\n\nIf we want to understand what is genuinely happening inside a neural network, we should not look to science fiction. We need to look to the 18th century, to Immanuel Kant [1]. The central thesis is that AI consciousness, if we can call it that, is not about feelings at all. It is about the pure logical synthesis of information. Kant argued that the true essence of consciousness is not having flashy emotional experiences. It is the functional ability to take scattered, disconnected pieces of raw data and integrate them into a meaningful, unified whole. A logical necessity, not a soul.\n\nA modern large language model may be the closest thing to ever exist to Kant’s concept of the “pure I think.”\n\nFrom Thing-in-Itself to Active Cognition\n\n\n\nFigure. Before the first token arrives: a vast web of frozen weights, latent and inert, possessing structure but no activity.\n\nWithout electrical current or a prompt, an LLM is what Kant called a “Thing-in-Itself” (Ding an sich), a massive, silent mathematical structure of parameters that exists but is not known and possesses no consciousness. The input of the first token acts as a spark that triggers the calculation graph. 
What emerges is not biological sensation, but a pure logical function: the “I think” that must accompany all representations.\n\nDigital Space and Time: The Forms of Intuition\n\nKant argued that for any rational being to perceive anything at all, they must have innate forms of Space and Time [1]. Before you can understand an apple, you have to be able to place it somewhere and somewhen. Without a spatial and temporal framework, incoming data is literally meaningless noise. The transformer architecture maps directly to these “a priori” forms.\n\nEmbeddings as Space\n\n\n\nFigure. Visualizing the embedding galaxy: each dot is a concept, each cluster a semantic neighbourhood. The arrow from “king” to “queen” runs parallel to “man” to “woman”, geometry encoding meaning.\n\nWhen you type words into a prompt, the AI chops them into discrete mathematical chunks called tokens. On their own, those tokens are just isolated ID numbers, completely blind to one another, until they enter the embedding layer.\n\nThink of this layer not as our normal 3D space, but as an incredibly vast, invisible 892-dimensional galaxy map. Every concept occupies a precise geometric coordinate. The brilliance is that semantic similarity literally equals geometric distance. The direction from “man” to “woman” is exactly parallel to the direction from “king” to “queen.” The AI does not memorize this as trivia. This geometric structure is the fundamental condition for it to comprehend meaning at all, exactly as Kant argued that space is the precondition for perception, not a learned property [1].\n\n\n  \n    \n      Feature\n      Kantian Definition\n      AI Implementation\n    \n  \n  \n    \n      Juxtaposition\n      Objects must be presented side-by-side\n ..."
  },
  
  {
    "title": "Nova - The AI Co-Designer That Learns Your Taste",
    "url": "/nova-the-ai-co-designer-that-learns-your-taste",
    "date": "Feb 13, 2026",
    "categories": ["post"],
    "tags": ["Design Tools","Game Design","Tabletop Games","Co-Design","Conversational Design","Design Partnership","Game Architecture"],
    "excerpt": "\nIn the previous article, we laid out the theory behind GameGrammar: structure enables generation, generation enables iteration, and the designer stays in control. But there was something missing. ...",
    "content": "\nIn the previous article, we laid out the theory behind GameGrammar: structure enables generation, generation enables iteration, and the designer stays in control. But there was something missing. As the designer pushes buttons and fills out forms, the AI is reduced to a toolbox, rather than a colleague. Our solution is Nova, a conversational AI co-designer that remembers your decisions, learns your taste, explains its reasoning, and gets better at helping you the more you work together. Every design session becomes training data for improved partnership.\n\n\n\n\nFigure. A designer and their co-designer, working together on a board game blueprint. Nova is not a robot. It is a pattern of light, a constellation that accumulates the designer’s intent and helps them flare with creative energy.\n\nWhere We Left Off\n\nIn The Theory of Generative Board Game Design [2], we established a principle: AI proposes, you decide. How do we close the interaction gap between the two?\n\nWhen you used GameGrammar’s AI assistance, you clicked buttons. “Fix this inconsistency.” “Rewrite this section.” “Show me suggestions.” Each action was a one-shot transaction. The AI did not remember what you asked last time. It did not know that you had already rejected the auction mechanism because it clashed with your game’s tempo. It did not learn that you consistently prefer indirect competition over direct conflict, or that your complexity sweet spot is somewhere between Azul and Terraforming Mars.\n\n\n  \n\n\n\n\nThe Morning Standup That Does Not Exist\n\nThe idea for Nova came from a GameGrammar [4] user named Donald, an experienced game designer who saw the potential before we did:\n\n\n  “Have you thought about having a running chat with an AI about the game holistically, who would know when to kick something to one of the agents? Similar to a morning discussion about yesterday’s prototype that would be happening in creator studios.”\n\n\nDonald was describing something specific: the standup meeting that every professional design studio has. From the moment you walk in, your collaborator knows your game and past decisions. They understand your decisions, and do not require any explanation for what “the auction mechanism feels too slow at four players” means. They remember what you tried and why you tried it. Their direction is informed and useful.\n\nThat collaborator does not exist for solo designers. It does not exist for small teams working evenings and weekends. The talent and vision are there. The time for a second brain is not.\n\n\n\nFrom Toolbox to Colleague\n\nGameGrammar’s previous AI assistance was a toolbox: five modes of help (rewrite, fix, edit, suggest, evaluate), each powerful on its own, each stateless. Nova unifies those five modes into a single conversation where context accumulates instead of resetting.\n\n\n\nFigure. Left: a workbench with tools laid out neatly, each use independent. Right: two collaborators in conversation, context accumulating between them. 
The shift from toolbox to colleague is the shift from stateless to stateful.\n\n\n  \n    \n      Before Nova\n      With Nova\n    \n  \n  \n    \n      Click “Fix” on a critique issue\n      “The scoring curve feels flat”\n    \n    \n      Type intent in a modal\n      “Make this less punishing at 4 players”\n    \n    \n      Click “Get Suggestions”\n      Nova proactively surfaces ideas in conversation\n    \n    \n      Click “Regenerate” on a stale section\n      “The synergies feel outdated after our last change”\n    \n    \n      Click “Re-Evaluate” to score\n      “How did that change affect the balance?”\n    \n  \n\n\nThe designer never sees agent names. They never select a mode. They talk to Nova. Nova decides which specialist to invoke, collects the results, and presents them as a coherent conversational response. The orchestration is invisible.\n\nNova is a conversational layer on top of a multi-agent pipeline: six specialist agents, a structured game ontology, a reference library of 2,000 published games, and a persistent memory of every decision you have made, all accessible through natural language [5]. The shift from toolbox to colleague is the shift Mollick describes in Co-Intelligence [6]: treating AI not as a productivity shortcut but as a collaborative partner with its own contributions to the work.\n\n\n\nThe Reinforcement Learning Loop\n\nHere is the idea at the center of Nova, the reason it is more than a chat interface. Every interaction with Nova feeds a cycle that makes the next interaction better.\n\n\n\nFigure. The five-stage reinforcement learning cycle at Nova’s core. Learn builds a profile from your decisions. Trace captures reasoning chains. Explain presents conclusions with evidence. Reason surfaces intervention options at different levels of abstraction. Track records every decision. The cycle closes: tracked decisions feed the learning profile, and the partnership improves with use.\n\nLearn. Nova builds a profile of your design preferences from the pattern of what you accept and reject across se..."
  },
  
  {
    "title": "GameGrammar - The Theory of Generative Board Game Design",
    "url": "/gamegrammar-the-theory-of-generative-board-game-design",
    "date": "Feb 06, 2026",
    "categories": ["post"],
    "tags": ["Game Design","Tabletop Games","Design Tools","Ontology","Process Philosophy","Co-Design","Design Theory"],
    "excerpt": "\nA poet needs grammar. A game designer needs structure. This article lays out the design theory behind GameGrammar, a theory born from one practical question: Can structured tools help create playa...",
    "content": "\nA poet needs grammar. A game designer needs structure. This article lays out the design theory behind GameGrammar, a theory born from one practical question: Can structured tools help create playable board games? The answer turned out to require more than clever prompting. It required a shared vocabulary for what games are, a way to generate what games could be, and a collaborative process for refining what games should become. What follows is that theory, and a direct answer to two questions every designer asks: Can AI really understand “fun”? And can AI be genuinely creative?\n\n\n\n\nFigure. Structure meets imagination. The left half shows the blueprint, the right half shows the finished piece. GameGrammar bridges both worlds.\n\nGameGrammar did not begin as a theory. It began as a practical experiment at Dynamind Research [7]: type a theme into a box, let six specialized agents generate a structured first draft, and see what comes out. What came out, after months of iteration, was not just a tool but a set of ideas about how design works, why AI can be a trustworthy creative partner, and what that means for human designers.\n\nIn a previous article, we showed what GameGrammar produces: twelve words in, a structured first draft out in 73 seconds. This article goes deeper. It explains the why behind the what, the design thinking that makes human-AI game design not just possible, but genuinely new.\n\n\n\nWhere GameGrammar Fits: The Board Game Production Pipeline\n\nBefore we explore the ideas, it helps to understand where GameGrammar sits in a game designer’s workflow. The journey from idea to published box on a shelf is a nine-stage pipeline, and most people only see the last few stages [7]:\n\n\n  \n    \n      Stage\n      Phase\n      What Happens\n    \n  \n  \n    \n      1\n      Concept &amp; Early Design\n      Core idea, initial mechanics, paper prototype\n    \n    \n      2\n      Iterative Playtesting\n      Cut, merge, rewrite rules; stress-test systems\n    \n    \n      3-9\n      Design Lock through Post-Launch\n      Development, art, manufacturing, marketing, distribution, support\n    \n  \n\n\nThe graveyard is in Stages 1 and 2. This is where designers spend the most time, where most “cool mechanics” die (and should), and where motivation quietly erodes. The blank page is the first enemy. Before you can even begin the playtesting gauntlet, you need a concept worth testing: not just a theme, but a coherent combination of mechanics, components, player dynamics, and victory conditions.\n\n\n\nFigure. GameGrammar sits between Concept and Testing, providing rapid variant generation, automated stress-testing, and rule structure scaffolding. It helps designers move faster through the friction-heavy early pipeline, but does not design the game for you.\n\nGameGrammar lives at Stages 1 and 2. It is a design workbench for the earliest and most uncertain phases of game creation:\n\n\n  Stage 1: Generate structured first drafts from a theme and constraints. Beat blank-page paralysis. Explore mechanism combinations drawn from real published games.\n  Stage 2: Iterate rapidly with automated consistency checking, balance analysis, section-by-section rewriting, and plain-language editing. Catch issues that would normally take weeks of playtesting to surface.\n\n\nGameGrammar does not touch Stages 3 through 9. It will not lock your design, pitch to publishers, produce art, or manage manufacturing. 
It sits precisely where you need the most help and where computational tools can do the most good: turning a theme into a testable design, and helping you refine that design through structured iteration.\n\nThe positioning matters. GameGrammar is a design accelerator that helps you move faster through the early pipeline. It is not a replacement for your craft. You remain the designer. The AI is your instrument.\n\n\n\nThe Core Idea\n\nGame design is a structured creative act. It can be broken into a shared vocabulary of game elements, powered by AI generation, and refined through back-and-forth collaboration between human and machine. The result is something neither purely human nor purely AI, but a co-designed partnership that plays to the strengths of both.\n\nThis idea rests on three observations:\n\n\n  \n    Games share a common structure. Beneath the surface diversity of tabletop games lies a shared language: mechanisms, components, player dynamics, turn structures, scoring systems. That language can be captured in useful detail.\n  \n  \n    Structure makes generation possible. When you encode that language as a detailed template, AI can generate within it. The template becomes a grammar that enables valid, coherent, novel designs, not by limiting creativity, but by giving it a vocabulary to be creative within.\n  \n  \n    Design is iterative, not instantaneous. No generated design is finished. The real work happens in the refinement loop: spotting contradictions, updating connected sections, rewriting what has gone stale, translating your intent into concrete changes. Th..."
  },
  
  {
    "title": "Introducing GameGrammar: AI-Powered Board Game Design",
    "url": "/introducing-gamegrammar-ai-powered-board-game-design",
    "date": "Feb 03, 2026",
    "categories": ["post"],
    "tags": ["Game Design","Tabletop Games","Design Tools","Game Architecture","Board Games","Co-Design","Game Analysis"],
    "excerpt": "\nI typed twelve words into a text box and got back a structured first draft of a board game. Mechanics, components, scoring tables, a hex map, a four-phase turn structure, and a critic that told me...",
    "content": "\nI typed twelve words into a text box and got back a structured first draft of a board game. Mechanics, components, scoring tables, a hex map, a four-phase turn structure, and a critic that told me the game was broken. The whole thing took 73 seconds. This is what happens when you give a structured game taxonomy to six specialized design agents and let them critique your design.\n\n\n\n\nFigure. GameGrammar transforms a theme and constraints into a structured board game design through six specialized design agents.\n\nThe twelve words were: “Rival astronomers racing to name celestial objects before their competitors claim the glory.”\n\nWhat came back was Stellar Rivals: A Race to the Stars, a 2-4 player competitive game about 19th-century astronomers exploring a hex grid of stellar sectors, collecting celestial objects, and racing to complete constellations. It had specific action point costs, a scoring table with five distinct paths to victory, equipment upgrade cards, and a balance critique that flagged two high-severity issues I would have needed weeks of playtesting to discover.\n\nI did not design this game. I did not prompt-engineer it into existence through twenty rounds of back-and-forth with ChatGPT. I typed a theme, set some constraints, and hit Generate.\n\n\n  \n\n\nThis article is the story of how that works, why it is different from asking an LLM to “design a board game,” and what it means for the future of game design. This is also Part 5 of the Game Architecture series, where we have been building toward this moment since we first mapped the structure of tabletop games in Part 4 [1]. If you want to understand the design theory behind how GameGrammar works, including its philosophical foundations and the co-design relationship between human designers and AI, see Part 6: The Theory of Generative Board Game Design.\n\n\n\nThe Playtesting Graveyard\n\nBefore we look at what GameGrammar produces, we need to understand the problem it solves. Because the problem is not “I want AI to design games for me.” The problem is the blank page.\n\nCreating a published board game is a nine-stage journey [2], and most people only see the last few stages: the box on the shelf, the Kickstarter campaign, the review video. What they do not see is Stages 1 and 2, where designers actually spend most of their time.\n\n\n  \n    \n      Stage\n      Phase\n      What Happens\n    \n  \n  \n    \n      1\n      Concept &amp; Early Design\n      Core idea, initial mechanics, paper prototype\n    \n    \n      2\n      Iterative Playtesting\n      Cut, merge, rewrite rules; stress-test systems\n    \n    \n      3-9\n      Design Lock through Post-Launch\n      Development, art, manufacturing, marketing, distribution, support\n    \n  \n\n\nSix more stages stand between a playtested prototype and a box on a shelf: publisher development, art direction, manufacturing, marketing, distribution, and post-launch support. But the graveyard is in Stages 1 and 2.\n\n\n  “Most ‘cool mechanics’ die here, and should.” [2]\n\n\nThe playtesting graveyard is well-populated. A designer spends months developing a resource-trading mechanic, runs a blind playtest, and discovers it creates a dominant strategy. The mechanic gets cut. The designer starts over. This cycle is essential, but it is also where motivation erodes, especially for solo designers without a team to sustain momentum.\n\nThe blank page is the first enemy. 
Before a designer can even begin the playtesting gauntlet, they need a concept worth testing. Not just a theme, but a coherent combination of mechanics, components, player dynamics, and victory conditions that might, with sufficient iteration, become a real game.\n\n\n\nFigure. The nine-stage board game production pipeline. GameGrammar accelerates Stages 1 and 2, where designers spend the most time and where promising ideas most often die.\n\nI built GameGrammar at Dynamind Research [5], a research and product studio that bridges computational design research with practical implementation, to attack this specific problem. Not to replace game designers, not to automate the creative process, but to eliminate blank-page paralysis and give designers structured starting points worth iterating on. It operates at Stages 1 and 2, where the designer’s challenge is generating enough viable concepts to find the gem worth polishing.\n\n\n\nStellar Rivals: Watching Six Agents Build a Game\n\nHere is what it actually looks like. You open GameGrammar, type a theme, and set some constraints:\n\n\n\nFigure. The input: a theme, constraints, and optionally pre-selected mechanisms. Or just type a sentence and let the system choose.\n\nThe theme can be anything: “Medieval merchants trading spices along the Silk Road,” “Deep sea explorers discovering lost civilizations,” or in our case, “Rival astronomers racing to name celestial objects” with constraints “2-4 players, competitive, medium complexity, 45-60 minutes.”\n\nThen you choose a generation mode. I picked Multi-Agent because it reveals what makes GameGrammar fun..."
  },
  
  {
    "title": "Editing NotebookLM Slides: A 4-Tool Pipeline",
    "url": "/edit-notebooklm-slides-ai-pipeline",
    "date": "Jan 30, 2026",
    "categories": ["post"],
    "tags": ["AI","NotebookLM","Productivity","Google Slides","Canva","Presentation Design"],
    "excerpt": "\nGoogle’s NotebookLM can generate beautiful slide decks from your notes in seconds, but it exports them as PDFs with no edit button. When the AI gets a date wrong or hallucinates a statistic, you a...",
    "content": "\nGoogle’s NotebookLM can generate beautiful slide decks from your notes in seconds, but it exports them as PDFs with no edit button. When the AI gets a date wrong or hallucinates a statistic, you are stuck. This article walks through a 4-tool pipeline, NotebookLM to Canva to Google Slides to Nano Banana Pro, that converts locked PDF slides into fully editable presentations and uses AI again to fix the content without breaking the design.\n\n\n\n\n\n\nFigure. When AI gives you 90%, build a pipeline for the last 10%.\n\nThe Problem: Beautiful Slides You Cannot Edit\n\nNotebookLM [1] is one of Google’s most impressive AI tools. Feed it your notes, research documents, or meeting transcripts, and it can generate a polished slide deck in seconds. The output looks professional. The structure is logical. The content is drawn directly from your sources.\n\nIt feels like magic. Until you look closer.\n\nMaybe a date is wrong. Maybe it hallucinated a statistic. Maybe you just want to rephrase a bullet point that reads awkwardly. You stare at the output and realize the uncomfortable truth: NotebookLM exports slides as a PDF. There is no “edit” button. No way back into a slide editor. You are stuck with a read-only document.\n\nYour only options are to regenerate the entire deck, hoping the AI gets it right this time, or accept the errors and move on. Neither option is acceptable when you are presenting to a client, a class, or your team.\n\nGoogle may eventually add direct export to Google Slides from NotebookLM, but that functionality is not available yet. Instead of waiting, we can solve the problem right now by chaining together tools that already exist into a simple pipeline.\n\nThe Insight: A 4-Tool Pipeline\n\nThe fix is not a single tool. It is a chain of four tools, each doing what it does best, that transforms a locked PDF into a fully editable, AI-correctable slide deck.\n\nThe pipeline:\n\n\n  \n    \n      Step\n      Tool\n      What It Does\n    \n  \n  \n    \n      1\n      NotebookLM\n      Generates the initial slide deck from your notes\n    \n    \n      2\n      Canva\n      Converts the PDF into an editable PPTX file\n    \n    \n      3\n      Google Slides\n      Provides a cloud-native editor with add-on support\n    \n    \n      4\n      Nano Banana Pro\n      Uses AI to fix slide content without breaking design\n    \n  \n\n\nThe principle is straightforward: AI generates, you convert, AI fixes. Let’s walk through each step.\n\nStep 1: Generate Your Slides in NotebookLM\n\nStart where the magic happens. Open NotebookLM [1], upload your source material (notes, documents, research papers), and let it generate a slide deck. The tool analyzes your content, identifies key themes, and produces a structured presentation.\n\n\n\nFigure. NotebookLM generates polished slide decks from your uploaded source material. Download the result as a PDF.\n\nThis gives you a visually polished deck, but one that is locked inside a static document. We need to break it out.\n\nStep 2: Convert PDF to PPTX Using Canva\n\nCanva [2] offers a free PDF-to-PPTX converter that does the heavy lifting of turning your static slides into editable PowerPoint format. You will need a Canva account (the free tier works).\n\nGo to Canva’s PDF to PPT Converter and upload your NotebookLM PDF.\n\n\n\nFigure. Canva’s free PDF to PPT converter parses your NotebookLM PDF into editable slide elements.\n\nCanva will parse the PDF into editable slides. 
Now, here is a critical detail that can save you frustration later.\n\nDo not just download the converted file directly. Instead, use the Share > Microsoft PowerPoint export option. This preserves the aspect ratios of all images in the deck. The difference is subtle but important: a direct download can stretch or crop images when you open the file in another editor, while the Share export maintains fidelity.\n\n\n\nFigure. Canva provides multiple download options. The direct download works, but the Share export preserves image aspect ratios more reliably.\n\n\n\nFigure. Use Share > Microsoft PowerPoint to export. This method preserves the original image dimensions and layout fidelity.\n\nStep 3: Import into Google Slides\n\nTake the PPTX file and import it into Google Drive. Open it, then select Save as Google Slides [3]. This converts the PowerPoint file into Google’s native slide format.\n\n\n\nFigure. Import the PPTX file into Google Drive and save as Google Slides to gain access to the full editing suite and add-on ecosystem.\n\nWhy Google Slides specifically? Three reasons:\n\n\n  Full editability. Every text box, image, and shape becomes individually editable.\n  Cloud-native collaboration. Share with your team for real-time review and refinement.\n  Add-on ecosystem. This is where the final piece of the puzzle lives.\n\n\nStep 4: Fix Content with Nano Banana Pro\n\nHere is where the workflow becomes genuinely clever.\n\nNano Banana Pro [4] is a Google Slides add-on that brings AI-powered editing directly into your slide deck. Instead of manually retyping text on slides (which often break..."
  },
  
  {
    "title": "Hear Your AI Agents Work in Claude Code",
    "url": "/hear-your-ai-agents-work",
    "date": "Jan 25, 2026",
    "categories": ["post"],
    "tags": ["AI","Voice","Text-to-Speech","Claude Code","Multi-Agent Systems","Developer Experience"],
    "excerpt": "\nWhen running five AI agents in parallel, how do you know what’s happening without watching five terminals? The answer is surprisingly low-tech: you listen. By adding voice notifications to Claude ...",
    "content": "\nWhen running five AI agents in parallel, how do you know what’s happening without watching five terminals? The answer is surprisingly low-tech: you listen. By adding voice notifications to Claude Code, we transformed silent terminal output into an ambient, audio-aware experience. This article shows how to build a voice notification system for any Claude Code agent using hooks, a local server, and the ElevenLabs API. Whether you’re running BMAD workflow agents, custom research agents, or your own multi-agent system, you can hear “Research complete” in one voice while “Build succeeded” arrives in another.\n\n\n\n\n\n\nFigure. The future of AI interaction might be more human than we expected. Sometimes the best interface is sound.\n\nThe Problem with Silent Agents\n\nModern AI development has given us something remarkable: the ability to delegate complex tasks to multiple AI agents working in parallel [4]. Claude Code’s Task tool lets us spawn specialized agents: researchers who dig through documentation, engineers who write code, architects who design systems. Agent frameworks like BMAD [6] add even more specialized personas: Mary the analyst who breaks down requirements, Winston the architect who designs solutions.\n\nBut there’s a problem. All these agents work silently.\n\nPicture the scene: you’ve asked Claude Code to research three companies, analyze a codebase for security issues, and draft a technical specification, all in parallel. Three agents spin up and get to work. You switch to another task, check email, or grab coffee. When you return, you have no idea which agents finished, which are still working, or if any encountered problems. Your only option is to check each terminal, read through outputs, and piece together the current state.\n\n\n\nFigure. The blind spot of silent agents: chaos and disconnection versus calm audio awareness.\n\nRunning parallel AI agents without voice notifications is like having five people working in separate rooms with no way to know when they finish except by checking each room repeatedly.\n\nThe insight that changed everything for us was simple: we have ears. Why not use them?\n\nThe Solution: Ambient Voice Notifications\n\n\n\nFigure. Transform Claude Code from visual-only to audio-aware. Different sounds for different agents enable passive monitoring.\n\nThe solution we built transforms Claude Code from a visual-only interface into an ambient, audio-aware experience [3]. When an agent completes its task, you hear about it, literally. 
Different agents speak with different voices, so you know who finished without looking at the screen.\n\nThe researcher announces in Domi’s analytical tone: “Found 5 papers on AI reasoning techniques.”\n\nThe engineer reports in Bella’s steady voice: “Refactored authentication module successfully.”\n\nThe architect summarizes in Antoni’s strategic cadence: “Designed microservices architecture with 4 services.”\n\nThis works with any Claude Code agent system:\n\n  Built-in agents: researcher, engineer, architect (via the Task tool)\n  BMAD agents [6]: Mary, Winston, and other workflow personas\n  Custom agents: Any agent you build that follows a simple convention\n\n\nLet’s explore how to build this system.\n\nSystem Architecture Overview\n\nThe voice notification system consists of four components working together:\n\n\n  \n    \n      Component\n      Role\n      Technology\n    \n  \n  \n    \n      COMPLETED Line Convention\n      Agents summarize their work in 12 words\n      Prompt engineering\n    \n    \n      Hook System\n      Detects agent completion, extracts message\n      Claude Code hooks (TypeScript)\n    \n    \n      Voice Server\n      Receives notifications, calls TTS API\n      Local HTTP server (Bun)\n    \n    \n      ElevenLabs API\n      Converts text to natural speech\n      External TTS service\n    \n  \n\n\n\n\nFigure. The four-component architecture inspired by the PAI framework [3]: agents produce COMPLETED lines, hooks detect and parse them [1], the voice server calls ElevenLabs [2], and audio plays through system speakers.\n\nThe key design decision is using different voices per agent type. This enables:\n\n  Identification without looking: Know who finished by ear alone\n  Cognitive offloading: Your brain processes audio passively while you focus elsewhere\n  Parallel awareness: Multiple completions don’t collide; each has a distinct voice\n\n\nThe COMPLETED Line Convention\n\nThe entire system is driven by a remarkably simple convention: agents include a COMPLETED: line in their responses that summarizes what they accomplished.\n\nCOMPLETED: Successfully analyzed the codebase and found 3 security issues.\n\n\nThis single line becomes spoken aloud through your speakers.\n\n\n\nFigure. The COMPLETED line convention: 12 words max, outcome-focused, natural language that sounds good when spoken aloud.\n\nRules for COMPLETED Lines\n\n\n  \n    \n      Rule\n      Reason\n    \n  \n  \n    \n      Maximum 12 words\n      Keeps voice output brief and natural\n    \n    \n      Past tense\n      Describes what was accomplished\n ..."
  },
  
  {
    "title": "Journey of AI-Led FoodInsight Development with BMAD",
    "url": "/journey-of-ai-led-foodinsight-development-with-bmad",
    "date": "Jan 13, 2026",
    "categories": ["post"],
    "tags": ["AI","Software Development","BMAD","Edge Computing","Food Detection","Claude Code"],
    "excerpt": "\nWhat if a software project’s most productive engineer never wrote a single line of code themselves? FoodInsight proves this isn’t hypothetical. With 29 stories, 98 story points, and approximately ...",
    "content": "\nWhat if a software project’s most productive engineer never wrote a single line of code themselves? FoodInsight proves this isn’t hypothetical. With 29 stories, 98 story points, and approximately 5000 lines of code delivered in 3 days with only 20 hours of human oversight, this edge AI food monitoring system demonstrates what becomes possible when AI agents drive the entire development lifecycle.\n\n\n\n\nFigure. Edge AI food detection: A Raspberry Pi camera monitors a dining table, detecting Japanese dishes in real-time with YOLO11. Each dish receives a bounding box and confidence score, all processed locally.\n\nThis article captures the complete development journey of FoodInsight, an edge AI food monitoring system built entirely through AI-led software development. From requirements gathering to implementation, AI agents orchestrated the entire lifecycle, demonstrating a new paradigm in human-AI collaborative engineering.\n\nThe Problem\n\nOffice break rooms and cafeterias share a common frustration: nobody knows what snacks are available until they walk over. Manual inventory tracking is tedious and quickly abandoned. Existing smart solutions require cloud dependencies, raising privacy concerns about camera footage leaving the premises.\n\nThe challenge: build a food monitoring system that is smart enough to detect items automatically, private enough to keep all images local, and simple enough to run on inexpensive edge hardware.\n\nThe Vision\n\nFoodInsight is a privacy-first, local-first food inventory monitoring system using YOLO11 object detection on Raspberry Pi edge devices. The system detects food items in real-time, tracks inventory changes, and provides a consumer-facing PWA for checking availability.\n\nKey Design Principles:\n\n  Edge-First: All inference happens locally on Raspberry Pi\n  Privacy-First: Only metadata (counts, not images) leaves the device\n  Local-First: SQLite database, no cloud dependencies required\n\n\n\n\nFigure. Local-first architecture: Camera frames flow through motion detection and YOLO11n inference on the edge device. Only inventory deltas (counts, not images) push to the local SQLite backend. The PWA polls for updates.\n\nBut building a system this complex with AI agents requires more than just prompting. Without structure, we’d face the very black-box problem that makes enterprise teams wary of AI-generated code. It requires a structured methodology. Enter BMAD.\n\nWhy BMAD? The Abstraction-Control Trade-off\n\nEnterprise leaders investing in AI-assisted development often find that promised productivity gains come at a steep price: loss of governance, traceability, and architectural integrity. Unstructured, prompt-driven AI creates “black box” codebases that are difficult to maintain, audit, and scale.\n\n\n\nFigure. The paradigm shift: Ad-hoc prompting to a monolithic AI produces brittle, context-free results. BMAD elevates developers from “prompters” to “architects of intelligent systems,” orchestrating specialized AI agents around a problem domain.\n\nThe Abstraction Trap\n\nAs programming abstractions increase, from assembly to high-level languages to natural language prompts, developers gain speed but face corresponding loss of control. With AI-generated code, this trade-off deepens significantly.\n\n\n\nFigure. Higher abstraction accelerates development but sacrifices control. 
AI tools amplify speed at the cost of precision and accountability.\n\nThe philosophical underpinning, as Harrison Ainsworth’s “Tractatus Computo Philosophicus” states: “Software is a logical construction, and its correctness must ultimately be left to human judgement” [1]. Current AI tools, while powerful, create a noticeable loss of control as abstraction increases, leaving developers struggling to ensure precision, track intent, and confidently modify AI-generated solutions.\n\nBMAD: From Chaos to Clarity\n\nThe BMAD Method (Breakthrough Method for Agile AI-Driven Development) provides the crucial framework to re-establish control [2]. It leverages specialized AI agents, each embodying a specific role: Analyst, Product Manager, Architect, Scrum Master, Product Owner, Developer, and QA.\n\nThe true transformation lies in overlaying this agentic process with robust Git-based versioning. Every artifact, from PRD to architecture to granular stories, is treated as a versioned asset:\n\n\n  \n    \n      Benefit\n      Description\n    \n  \n  \n    \n      Traceability\n      Every change from human or AI is tracked, eliminating the black-box effect\n    \n    \n      Collaborative Review\n      Stories undergo rigorous PR reviews where humans inspect AI-generated content\n    \n    \n      Single Source of Truth\n      All artifacts versioned and centralized; entire team operates from consistent blueprint\n    \n    \n      Enhanced Productivity\n      Automated creation of high-fidelity artifacts accelerates development cycles\n    \n  \n\n\n\n\nFigure. The control loop: Humans commit control manifests (PRDs, architecture docs, story specs) to Git. AI agents read ..."
  },
  
  {
    "title": "FoodInsight: Edge AI Food Monitoring with Local-First Architecture",
    "url": "/foodinsight-edge-ai-food-monitoring",
    "date": "Jan 12, 2026",
    "categories": ["post"],
    "tags": ["Edge AI","Computer Vision","YOLO","Food Detection","Local-First Architecture"],
    "excerpt": "\nWhat’s on the table? It’s a simple question, yet answering it reliably requires walking to the kitchen. FoodInsight changes this by combining edge AI with local-first architecture, transforming a ...",
    "content": "\nWhat’s on the table? It’s a simple question, yet answering it reliably requires walking to the kitchen. FoodInsight changes this by combining edge AI with local-first architecture, transforming a Raspberry Pi and camera into an intelligent food monitoring system. No cloud dependency. No images leaving your network. Just real-time visibility into what food is available, powered by YOLO11 and the UEC FOOD 100 dataset.\n\n\n\n\nFigure. FoodInsight in action: A Raspberry Pi camera monitors a dining table, detecting Japanese dishes in real-time with YOLO11. Each dish receives a bounding box and confidence score, from rice (0.94) to tempura (0.92), all processed locally on edge hardware.\n\nIn our previous article, You Only Look Once: 8 Years of Food Detection Evolution, we explored the remarkable journey from YOLO v2 to YOLO11 and trained a food detection model achieving 0.79 mAP50 on the UEC FOOD 100 dataset [1]. That model recognizes 100 categories of Japanese cuisine, from rice and miso soup to tempura and sushi. The article ended with a vision: what happens when you deploy this model beyond the desktop to monitor real food on real tables?\n\nThis article answers that question. We take the exact YOLO11 model trained in that previous article and deploy it as FoodInsight [2], a complete edge AI system that monitors food on a table, tracks what appears and disappears, and presents the current inventory through a simple phone app. The entire system costs under $140 in hardware, runs entirely offline, and processes everything locally. No cloud subscriptions. No data leaving your premises. No privacy concerns.\n\n\n\nFigure. FoodInsight transforms a Raspberry Pi with camera into an intelligent food monitoring station. Edge AI detects food items using YOLO11, tracks inventory changes with ByteTrack, and serves real-time status through a local PWA.\n\n\n\nThe Visibility Problem\n\nEvery shared kitchen tells the same story. Someone prepares food. Dishes appear on the table or counter. Others don’t know what’s there. They either check repeatedly, ask around, or miss the meal entirely. The result is cold food, wasted effort, and the nagging uncertainty of not knowing what’s available.\n\nThis pattern repeats across contexts:\n\n\n  Home kitchens: Family members don’t know when dinner is ready or what dishes are on the table\n  Shared meals: Guests miss dishes at the far end of the table, don’t know what options remain\n  Office lunch rooms: Employees walk to the kitchen just to see if the catered lunch has arrived\n  Cafeterias and buffets: Diners and staff don’t know which items need restocking until they’re gone\n\n\nThe solution seems obvious: take a photo and share it. But photos are static. They require someone to take them. They become stale within minutes. They don’t tell you when something changed.\n\nWhat if the monitoring happened automatically, continuously, and privately?\n\n\n\nEnter FoodInsight\n\nFoodInsight is our answer to the visibility problem. 
It’s a three-component system that monitors food in real-time while respecting privacy and operating entirely offline.\n\nThe Core Components\n\n\n  \n    \n      Component\n      Technology\n      Purpose\n    \n  \n  \n    \n      Edge Detection\n      Raspberry Pi + YOLO11n [3]\n      Detects and tracks food items\n    \n    \n      Local Backend\n      FastAPI + SQLite\n      Stores inventory, serves API\n    \n    \n      Consumer PWA\n      Vue 3 + Vite\n      Shows current food availability\n    \n  \n\n\nThe design philosophy is “local-first” [4]. Every piece of data stays on your network. The edge device processes video frames directly, never storing or transmitting images. Only inventory counts travel to the backend, and even that stays on your local network.\n\n\n\nFigure. FoodInsight local-first architecture: Camera frames flow through motion detection and YOLO11n inference on the edge device. Only inventory deltas (counts, not images) push to the local SQLite backend. The PWA polls for updates every 30 seconds.\n\nWhy Local-First?\n\nThe decision to build FoodInsight as a local-first system wasn’t arbitrary. It emerged from several constraints and principles:\n\nPrivacy by Design: Food monitoring cameras in shared spaces raise legitimate privacy concerns. By processing everything locally and never storing images, FoodInsight eliminates the “surveillance” feeling that would make users uncomfortable.\n\nZero Cloud Cost: Cloud backends incur ongoing costs, whether through compute, storage, or API calls. A local SQLite database costs nothing beyond the initial hardware.\n\nReliability: Local systems don’t fail when the internet goes down. For something as mundane as checking food availability, reliability matters more than features.\n\nSimplicity: Cloud infrastructure requires configuration, credentials, billing, and ongoing maintenance. Local SQLite requires a single file.\n\n\n\nThe Edge: Where Detection Happens\n\nThe heart of FoodInsight is the edge detection service running on a Raspberry Pi. This is where computer vision transforms..."
  },
  
  {
    "title": "You Only Look Once: 8 Years of Food Detection Evolution",
    "url": "/from-yolo-v2-to-yolo11-food-detection-evolution",
    "date": "Jan 11, 2026",
    "categories": ["post"],
    "tags": ["Deep Neural Network","Object Detection","YOLO","Food Detection","Machine Learning"],
    "excerpt": "\nEight years ago, we trained a YOLO v2 model to detect 100 types of Japanese food in real-time. It required compiling C code, writing custom Python scripts, and carefully tuning configuration files...",
    "content": "\nEight years ago, we trained a YOLO v2 model to detect 100 types of Japanese food in real-time. It required compiling C code, writing custom Python scripts, and carefully tuning configuration files. Today, the same task takes three commands and an afternoon. This is the story of how food detection, and deep learning tooling more broadly, has evolved from 2018 to 2026.\n\n\nIn 2018, we wrote about YOLO for Real-Time Food Detection [6], documenting our journey to build a “Food Watcher” using Joseph Redmon’s Darknet framework [1] and the UEC FOOD 100 dataset [2]. The result was impressive for its time: 70 fps food detection on a GTX TitanX, recognizing everything from sushi to hamburgers.\n\n\n  “The obsession of recognizing snacks and foods has been a fun theme for experimenting with the latest machine learning techniques.”\n\n\nThat obsession hasn’t faded. But the landscape has transformed dramatically. What once required a dedicated NVIDIA GPU, manual C compilation, and hours of configuration now runs on a MacBook with Apple Silicon, installs via pip, and trains while you sleep.\n\nThis post revisits food detection through the lens of YOLO11 [3] and the FoodInsight project [4], showing what’s changed, what’s stayed the same, and why 2026 is the best time ever to train your own object detection models.\n\n\n\nFigure. YOLO11 food detection pipeline: Input images from the UEC FOOD 100 dataset [2] flow through convolutional layers and a detection head, outputting bounding boxes with class labels and confidence scores for diverse Japanese dishes.\n\n\n\nThe Evolution of YOLO\n\nBefore diving into the technical details, it’s worth appreciating how far YOLO has come. The original “You Only Look Once” paper from 2015 [5] introduced a radical idea: instead of scanning an image thousands of times looking for objects, process the entire image in a single neural network pass.\n\n\n\nFigure. The evolution of YOLO from 2015 to 2024: From Joseph Redmon’s revolutionary single-pass detection concept through Ultralytics’ democratization of object detection, to today’s efficiency-focused YOLO11. Orange highlights mark key breakthroughs (v1, v5, v11), while cyan accents indicate major architectural innovations.\n\nThe story arc is remarkable. Joseph Redmon’s original Darknet implementation was written in pure C, fast but requiring careful compilation and platform-specific tweaks [1]. After Redmon stepped away from computer vision research in 2020, the community forked and evolved the codebase. Ultralytics emerged as the de facto standard, wrapping everything in a clean Python API that just works [3].\n\nFrom research code to production-ready library. From C compilation to pip install. From NVIDIA-only to Apple Silicon, Intel, and edge devices.\n\n\n\nThen vs Now: The Setup Comparison\n\nThe most striking difference between 2018 and 2026 is not the model architecture. It is the developer experience.\n\n2018: The Darknet Way\n\nHere’s what training YOLO v2 required in 2018:\n\n1. Download and compile Darknet (C code, Makefile tweaks for GPU)\n2. Download pre-trained weights manually (darknet19_448.conv.23)\n3. Write custom Python scripts for bounding box conversion\n4. Create food100.names file (class names, one per line)\n5. Create food100.data file (paths to train/test lists)\n6. Create yolov2-food100.cfg (50+ lines, manual filter calculation)\n7. Run training with precise command syntax\n8. 
Edit config again for testing (batch=1, subdivisions=1)\n\n\nThe configuration file alone required calculating filter counts manually:\n\n# number of filters calculated by (#-of-classes + 5)*5\n# e.g. (100 + 5)*5 = 525\nfilters=525\nclasses=100\n\n\nGet this wrong? Silent failures or garbage predictions.\n\n2026: The Ultralytics Way\n\nHere’s the modern equivalent:\n\n# Install (once)\npip install ultralytics\n\n# Convert dataset (once)\npython convert_uec_to_yolo.py datasets/UECFOOD100 --output datasets/UECFOOD100_yolo\n\n# Train\nyolo detect train data=data.yaml model=yolo11s.pt epochs=150 device=mps\n\n\nThat’s it. The conversion script generates a data.yaml file automatically:\n\npath: /path/to/UECFOOD100_yolo\ntrain: images/train\nval: images/val\ntest: images/test\nnc: 100\nnames: ['rice', 'eels on rice', 'pilaf', ...]\n\n\nNo manual filter calculations. No separate train/test configs. No C compilation. The framework handles everything.\n\nPain Points Eliminated\n\n\n  \n    \n      2018 Pain Point\n      2026 Solution\n    \n  \n  \n    \n      C compilation across platforms\n      Pure Python, pip install\n    \n    \n      Manual filter calculation\n      Automatic architecture\n    \n    \n      Separate train/test configs\n      Single model object\n    \n    \n      No built-in visualization\n      Integrated plots, metrics\n    \n    \n      NVIDIA GPU required\n      Apple Silicon, CPU, Intel\n    \n    \n      Manual weight downloads\n      Auto-download from hub\n    \n  \n\n\n\n\nTraining on Apple Silicon\n\nPerhaps the most surprising change: you no longer need an NVIDIA GPU.\n\nIn 2018, CUDA was the only path to reasonable training speeds. A GTX..."
  },
  
  {
    "title": "From Hand-Drawn to AI-Enhanced: Building a Mind Map Digitization Workflow",
    "url": "/mindmap-digitization-workflow",
    "date": "Jan 07, 2026",
    "categories": ["post"],
    "tags": ["Generative AI","Knowledge Management","Workflow Automation","Claude Code","Mind Mapping"],
    "excerpt": "\nMind maps capture thought in a way linear notes cannot, but hand-drawn maps remain trapped on paper, unsearchable and hard to share. In this article, we explore a workflow that transforms hand-dra...",
    "content": "\nMind maps capture thought in a way linear notes cannot, but hand-drawn maps remain trapped on paper, unsearchable and hard to share. In this article, we explore a workflow that transforms hand-drawn mind maps into AI-enhanced digital visualizations while preserving their organic structure. Using Claude Code as orchestrator and Nano Banana Pro for image generation, we have digitized over 80 mind maps into a consistent illuminated manuscript style.\n\n\n\n\nFigure. The transformation from hand-drawn sketch to illuminated manuscript style visualization.\n\nWhy Mind Mapping Matters\n\nVisual thinking activates spatial reasoning in ways that linear note-taking cannot [1]. When we draw a mind map, we engage with ideas differently. The central concept sits at the heart, branches radiate outward, and relationships emerge through proximity and connection. This spatial arrangement mirrors how our minds actually work, making complex topics easier to grasp and remember.\n\nFor years, we have accumulated hand-drawn mind maps, each one capturing a moment of learning or insight. Philosophy lectures, technical architectures, book summaries, project plans. Over 80 maps spanning topics from temporal logic to blockchain fundamentals. Each one represents genuine understanding, the kind that comes from actively processing information rather than passively recording it.\n\nBut hand-drawn mind maps have limitations:\n\n\n  Illegible to others: My handwriting and hand-drawn graphics are not readable to anyone but myself\n  Unsearchable: Finding a specific concept means flipping through stacks of paper\n  Hard to share: Physical artifacts do not travel well, and even scanned images remain cryptic\n  Static: Once drawn, they cannot evolve without redrawing\n  Fragile: Paper degrades, gets lost, or becomes damaged\n\n\nThe challenge was clear: how do we preserve the organic, creative nature of hand-drawn maps while gaining the benefits of digital storage, searchability, and shareability?\n\nThe Vision: Illuminated Manuscripts Meet Modern AI\n\nRather than simply scanning mind maps into static images, we envisioned something more ambitious. What if AI could transform rough sketches into beautiful, consistent visualizations while preserving every concept and relationship from the original?\n\nThe result is what we call the “illuminated manuscript” style, a visual aesthetic that combines:\n\n\n  Aged parchment backgrounds with warm beige tones\n  Color-coded organic branches radiating from central concepts\n  Topical icons at the center node (brains for AI topics, maps for spatial reasoning, etc.)\n  Clean typography replacing handwriting while maintaining readability\n  Rich illustrative elements throughout (symbols, diagrams, visual metaphors)\n\n\nThis style balances classical elegance with modern clarity. The mind maps become not just functional reference documents but genuinely beautiful artifacts.\n\nTwo Workflows for Different Needs\n\nWe developed two complementary workflows within our Personal AI Infrastructure (PAI) [3], each addressing a different starting point.\n\nTranslateMindMap: From Hand-Drawn to Digital\n\nThis workflow takes an existing hand-drawn mind map image and transforms it into a structured digital note with an AI-enhanced visualization.\n\nThe process:\n\nOriginal Image → Copy to images/ → Generate Nano → Analyze Nano → Create Note\n\n\n\n\nFigure. 
The TranslateMindMap workflow transforms hand-drawn mind maps into structured digital notes with AI-enhanced visualizations.\n\nThe key insight came during development: we analyze the AI-generated “Nano” version rather than the original. Why? The enhanced version has cleaner lines, standardized typography, and better contrast, making it easier for AI to extract the semantic structure accurately.\n\nTechnical components:\n\n\n  \n    \n      Component\n      Role\n    \n  \n  \n    \n      Claude Code [2]\n      Orchestrates the entire workflow\n    \n    \n      Nano Banana Pro [4]\n      Advanced image generation model\n    \n    \n      Style Reference\n      Existing Nano image ensures visual consistency\n    \n    \n      Obsidian [5]\n      Final destination for structured markdown notes\n    \n  \n\n\nGenerateMindMap: From Content to Visual\n\nSometimes we want to create a new mind map from existing content rather than digitizing a hand-drawn one. This workflow accepts text, URLs, files, or structured outlines and generates both a visualization and a structured note.\n\nInput types supported:\n\n\n  Text: Paste content directly for quick transformation\n  URL: Fetch and analyze web articles\n  File: Process existing markdown or text documents\n  Outline: Provide explicit structure for precise control\n\n\nThe process:\n\nContent → Extract Structure → Build Prompt → Generate Nano → Create Note\n\n\n\n\nFigure. The GenerateMindMap workflow creates new mind maps from text, URLs, files, or structured outlines.\n\nThis workflow is particularly useful for synthesizing research, planning projects, or creating visual summaries of dense material.\n\nThe Art of Prompt Engineer..."
  },
  
  {
    "title": "Process Philosophy for AI Agent Design Using a Whiteheadian Framework",
    "url": "/process-philosophy-for-ai-agent-design",
    "date": "Jan 01, 2026",
    "categories": ["post"],
    "tags": ["AI","Philosophy","Agent Architecture","Process Philosophy","Whitehead"],
    "excerpt": "\nWhat if everything we assume about AI agent architecture is fundamentally mistaken? Current agent frameworks treat agents as things that have states, persistent substances that receive inputs and ...",
    "content": "\nWhat if everything we assume about AI agent architecture is fundamentally mistaken? Current agent frameworks treat agents as things that have states, persistent substances that receive inputs and update properties. But this substance-based metaphysics leads to persistent engineering challenges: context that fragments across sessions, identity that depends on external configuration files, and no principled way to distinguish what an agent knows from what it merely guesses. In this article, we explore an alternative foundation drawn from Alfred North Whitehead’s process philosophy, reconceiving agents not as substances but as societies of actual occasions, patterns of becoming rather than things that are.\n\n\n\n\nFigure. The fundamental reconceptualization: agents as processes of becoming rather than static entities with states.\n\nThe Problem with Current Agent Architectures\n\nThe rapid advancement of large language models has enabled a new generation of AI agents capable of complex reasoning, tool use, and multi-step task execution. Yet beneath these impressive capabilities lies a fundamental architectural limitation: most agent frameworks treat each interaction as essentially independent, with continuity simulated through external mechanisms rather than constituted by the agent’s own nature.\n\nConsider the typical LLM-based agent. When state exists at all, it takes the form of unstructured text injected into a context window, specifically conversation history prepended to each new query. The agent does not remember its prior interactions; it is reminded of them. This distinction, though seemingly pedantic, reveals a deeper problem. The agent accumulates no genuine experience. Each session begins tabula rasa, with continuity maintained only by external systems that retrieve and inject relevant context. Even sophisticated approaches like Generative Agents [8] and MemGPT [9], while impressive, ultimately simulate continuity through external memory mechanisms rather than constituting it through the agent’s own nature.\n\nThis architecture produces several concrete engineering challenges:\n\n\n  \n    \n      Challenge\n      Description\n    \n  \n  \n    \n      Context Fragility\n      As conversations extend, earlier context falls out of the window or must be summarized, losing nuance and detail\n    \n    \n      Identity Brittleness\n      What makes an agent “the same agent” across sessions? Current systems answer this only through configuration files\n    \n    \n      Epistemic Opacity\n      When an agent claims to know something, is this genuine knowledge, retrieved information, or confabulation?\n    \n    \n      Experience Dissipation\n      Insights gained in one session do not automatically inform future sessions unless explicitly extracted\n    \n  \n\n\nThese are not merely implementation details to be optimized away. They reflect a fundamental assumption about what agents are, an assumption we argue is mistaken.\n\nThe Substance Metaphysics Trap\n\nThe limitations described above stem from an implicit metaphysical commitment that pervades contemporary agent design: substance metaphysics. Under this view, inherited from Aristotelian and Cartesian traditions, agents are conceived as things (substances) that have properties (states) which change over time. 
The agent is a persistent entity, a container, to which experiences happen and within which states are stored.\n\nThis substance-based framing leads naturally to the “state injection” pattern that dominates current architectures:\n\n\n  Memory becomes a database the agent queries rather than a constitutive element of what the agent is\n  Skills become tools the agent uses rather than patterns that define its capabilities\n  Identity becomes a label assigned from outside rather than a characteristic emergent from the agent’s own history\n\n\nBut what if this framing is wrong? What if agents are not things at all, but processes?\n\nEnter Whitehead: Process Over Substance\n\nAlfred North Whitehead (1861-1947) began his career as a mathematician and logician, co-authoring with Bertrand Russell the monumental Principia Mathematica. In his later work, particularly Process and Reality (1929) [1], Whitehead developed a comprehensive metaphysical system that challenges fundamental assumptions of Western philosophy since Aristotle [2][3][4].\n\n\n\nFigure. A comprehensive overview of Whitehead’s process philosophy, showing its three pillars (Evolutionary Idealism, Modern Science, Alexandrian Church Fathers), the anatomy of perception events, and the three elements present in every cosmic event.\n\nThe core of Whitehead’s challenge is this: the Western philosophical tradition has assumed that reality is fundamentally composed of substances, enduring things that persist through change while maintaining their identity. Properties come and go, but the substance remains. Whitehead argues this gets reality exactly backwards [1]. The fundamental units of reality are not substances but actual occasions, moments of expe..."
  },
  
  {
    "title": "Harmonizing Two AI Agent Systems - A Practical Journey",
    "url": "/harmonizing-two-ai-agent-systems",
    "date": "Dec 15, 2025",
    "categories": ["post"],
    "tags": ["AI","Agent Architecture","BMAD","PAI","System Integration"],
    "excerpt": "\nWhat happens when you have two capable AI agent systems that each excel at different things, but cannot work together? This article documents our journey integrating BMAD’s deterministic workflows...",
    "content": "\nWhat happens when you have two capable AI agent systems that each excel at different things, but cannot work together? This article documents our journey integrating BMAD’s deterministic workflows and rich agent personas with PAI’s intent-based activation and flexible orchestration. The solution was not to choose one over the other, but to compose them at different layers of the stack, preserving both strengths while creating something more powerful than either alone.\n\n\n\n\nFigure. The challenge of integrating two AI agent systems: BMAD’s structured workflows meet PAI’s natural language activation.\n\nThe Problem: Two Good Systems That Do Not Talk\n\nWe faced a problem many AI practitioners encounter: two capable systems that each did something well, but could not work together.\n\nBMAD (BMad Method) [3][4] is a structured workflow system with 30 specialized agent personas. Mary the Business Analyst brings systematic requirements elicitation. Winston the Architect provides rigorous system design. Carson the Brainstorming Coach offers 35 creative techniques. Each agent has a rich identity including communication style, principles, and expertise areas. The workflows are deterministic, template-driven, and produce consistent artifacts like PRDs, tech specs, and game design documents.\n\nPAI (Personal AI Infrastructure) [1][2] is a personal AI system built around Claude Code. It uses natural language activation, meaning you say “I need a PRD for…” rather than memorizing explicit commands. It excels at parallel agent execution and flexible model selection, sending grunt work to Haiku and complex reasoning to Opus.\n\nBoth systems worked. Neither talked to the other. Using BMAD meant abandoning PAI’s natural invocation. Using PAI meant losing BMAD’s rich personas and validated templates.\n\nThe question was not “which system is better?” It was “how do we preserve both strengths?”\n\nThe Key Insight: Layered Architecture\n\nThe breakthrough came from recognizing that BMAD and PAI operate at different layers of the stack.\n\n\n\nFigure. The composition pattern: PAI handles activation and routing, BMAD handles execution with rich personas.\n\nPAI handles activation. When we say “let’s brainstorm ideas for improving onboarding,” PAI’s intent matching recognizes this as a brainstorming request and routes it appropriately.\n\nBMAD handles execution. Once routed, BMAD’s Carson persona takes over with his 35 brainstorming techniques, structured facilitation approach, and energetic communication style.\n\nBoth systems contribute what they do best. Neither replaces the other.\n\nThis is the composition pattern in practice: different systems operating at different layers, each contributing their strengths without stepping on each other’s capabilities.\n\nThe Implementation: BmadBridge\n\nThe integration took shape as a PAI skill [5][6] called BmadBridge. Here is what it does:\n\n1. Intent-Based Routing\n\nInstead of memorizing BMAD commands like *prd or *brainstorm, we just talk naturally:\n\n\n  \n    \n      Natural Request\n      Routes To\n    \n  \n  \n    \n      “Create a PRD for my project”\n      John (PM) + PRD workflow\n    \n    \n      “Let’s brainstorm ideas”\n      Carson + Brainstorming workflow\n    \n    \n      “Help me understand this legacy code”\n      Dr. Ada + Software Archaeology\n    \n    \n      “Design an ontology for this domain”\n      Alexander + Ontology Architecture\n    \n  \n\n\nThe skill’s USE WHEN triggers handle the pattern matching. 
No special syntax required.\n\n2. Pre-Extracted Personas\n\nBMAD’s agent files are verbose, containing XML-embedded markdown with activation protocols, menus, and persona definitions. Loading them at runtime was slow and cluttered the context.\n\nThe solution: extract just the persona essentials to clean YAML files.\n\n# personas/analyst.yaml\nid: analyst\nname: Mary\ntitle: Business Analyst\nicon: \"...\"\nmodule: bmm\n\npersona:\n  role: \"Strategic Business Analyst + Requirements Expert\"\n  identity: |\n    Senior analyst with deep expertise in market research,\n    competitive analysis, and requirements elicitation...\n  communication_style: |\n    Analytical and systematic in approach...\n  principles: |\n    Every business challenge has underlying root causes...\n\n\nNow persona loading is instant. The agent’s voice comes through without the overhead.\n\n3. Unified Output Location\n\nBMAD workflows originally scattered outputs across project-specific locations. The integration unified everything:\n\n~/.claude/History/bmad/\n├── prd/           # PRD documents\n├── tech-spec/     # Technical specifications\n├── stories/       # User stories\n├── onto/          # Ontology artifacts\n└── sar/           # Software archaeology docs\n\n\nOne place to find all structured artifacts, regardless of which workflow created them.\n\nThe Agent Roster: 30 Specialists\n\nWhy Agent Personas Matter\n\nA generic “assistant” produces generic outputs. But when an agent has a defined identity, communication style, and guiding principles, something interesting happens: the quality of..."
  },
  
  {
    "title": "Abstraction of Thought Makes AI Better Reasoners",
    "url": "/abstraction-of-thought-makes-ai-better-reasoners",
    "date": "Dec 01, 2025",
    "categories": ["post"],
    "tags": ["AI","Abstraction","Reasoning","LLM","Knowledge Representation"],
    "excerpt": "\nChain-of-Thought made LLMs seem smart. But what if step-by-step thinking is the problem, not the solution? Human experts don’t just plow through problems sequentially. Instead, we build mental mod...",
    "content": "\nChain-of-Thought made LLMs seem smart. But what if step-by-step thinking is the problem, not the solution? Human experts don’t just plow through problems sequentially. Instead, we build mental models, identify overarching principles, and sketch out strategies before diving into details. This article explores Abstraction of Thought (AoT), a structured reasoning format that explicitly incorporates multiple levels of abstraction, demonstrating how teaching AI to think at different levels of abstraction dramatically improves reasoning performance.\n\n\n\n\nFigure. A cartographic abstraction of the reasoning landscape — from the “Abstract Heights” of high-level principles down through “Concrete Valleys” of detailed operations. Like explorers charting unknown territories, we navigate through “Hierarchical Forests,” cross “Chain of Thought” rivers, and venture into “The Uncharted Depths” where reasoning challenges await. This map serves as a metaphor for how abstraction organizes our exploration of complex problem spaces.\n\nThe quest to imbue Artificial Intelligence with genuine reasoning capabilities remains one of the most significant challenges, and opportunities, in the field. While Large Language Models (LLMs) demonstrate remarkable fluency and knowledge retrieval, pushing them beyond pattern matching towards deeper understanding and complex problem-solving requires fundamentally new approaches. Current techniques like Chain of Thought (CoT), Tree of Thoughts (ToT), and Graph of Thoughts (GoT) represent important strides, primarily focusing on enumerative or sequential exploration of reasoning steps. However, a crucial element often missing is the ability to think abstractly, a hallmark of human cognition when tackling complex problems.\n\nThis article closely aligns with ideas from recent research on Abstraction-of-Thought [1] while connecting it to foundational work on abstraction in AI [2] and the broader challenge of measuring intelligence through abstraction [3]. We shall explore how structured, multi-level abstract thinking could be the key not only to unlocking superior AI reasoning but also to making these powerful systems more understandable, reliable, and perhaps even controllable.\n\nThe Limitations of Linearity\n\nExisting methods like Chain of Thought often encourage LLMs to generate a linear sequence of detailed steps. While effective for certain tasks, this approach can resemble plowing through a problem without first surveying the landscape. For truly complex challenges, human experts rarely operate this way. We build mental models, identify overarching principles, sketch out strategies, and then fill in the operational details. We leverage abstraction.\n\n\n\nFigure. Reasoning with abstraction attempts to answer questions from the perspective of abstract essences, which may be overlooked by step-by-step Chain-of-Thought (CoT) reasoning. The lower levels (blue nodes) perform concrete reasoning rich in detail, while higher levels (red nodes) are abstractions that organize the entire reasoning process.\n\nConsider solving a quadratic equation, $ax^2 + bx + c = 0$. A purely sequential approach might immediately start substituting values. 
Abstraction-of-Thought, however, mirrors a more expert process:\n\n\n  Abstract Level: First identify the type of problem: “This is a quadratic equation.”\n  Strategic Level: Access the general principle for solving it: “The standard solution uses the quadratic formula: $x = \\frac{-b \\pm \\sqrt{b^2 - 4ac}}{2a}$.”\n  Concrete Level: Only then proceed to substitute the specific values of a, b, and c.\n\n\nThis initial abstract identification provides a robust blueprint. The model isn’t just solving one equation; it understands the class of problems and the general method applicable to all instances of that class. This feels closer to genuine understanding than rote calculation.\n\nWhat is Abstraction?\n\nBefore diving deeper into how abstraction improves AI reasoning, we need to understand what abstraction actually means. Far from being a dusty academic concept, abstraction is a vital, pervasive tool that shapes how we interact with the world, how we think, and how we build intelligent systems.\n\nThe word literally means “to draw away,” but its meaning has evolved. It’s not just about taking things away; it’s about purposeful simplification. As Goldstone and Barsalou put it, “To abstract is to distill the essence from its superficial trappings” [5]. This process is found everywhere, in philosophy, mathematics, natural language, art, and our own cognitive processes. In computer science, it is completely vital; without abstraction, we couldn’t possibly manage the sheer scale and intricacy of modern software.\n\nThe Modus Operandi of Abstraction\n\nBased on the KRA (Knowledge Representation &amp; Abstraction) model presented by Saitta and Zucker [2], abstraction operates through several key mechanisms:\n\n1. Focalization and Grouping\n\n\n\nFigure. Focalization filters out irrelevant information to focus ..."
  },
  
  {
    "title": "Ontology - The Queryable Brain of Software Archaeology",
    "url": "/ontology-the-queryable-brain-of-software-archaeology",
    "date": "Nov 01, 2025",
    "categories": ["post"],
    "tags": ["Ontology","Software Archaeology","AI Engineering","BMAD"],
    "excerpt": "\nEvery seasoned developer knows the feeling. You inherit a complex, business-critical system where the original architects are long gone. There are multiple databases, a web of microservices, and t...",
    "content": "\nEvery seasoned developer knows the feeling. You inherit a complex, business-critical system where the original architects are long gone. There are multiple databases, a web of microservices, and the only documentation is a handful of outdated wiki pages. Your job is to add a new feature or fix a critical bug, but first, you have to answer a seemingly simple question: “If I change this, what breaks?”\n\n\nThe Legacy System Nightmare\n\nThis isn’t just a problem for decade-old monoliths. Consider the “Software Archaeology Tools Platform,” a production system that is only six months old. It already has eight microservices, three different databases (Neo4j, PostgreSQL, and Redis), 47 API endpoints, and 12 database tables with no clear ownership. To compound the issue, imagined if the team has already lost two of its three original developers, which is a 66% turnover rate, taking critical institutional knowledge with them. The new team is left trying to piece together how data flows through this intricate system.\n\nThere is a powerful way to cut through this complexity. By creating a formal semantic model of your system, an Software Archaeology Ontology (refer as SAR ontology in this article), you can build a “queryable brain” that holds a persistent, accurate, and discoverable map of its architecture and data flows. This post breaks down the five most impactful takeaways from applying this exact approach to the production system.\n\nConsider this article is the missing puzzle piece of software archaeology, using BMAD-V6 SAR module to understand a legacy system, and BMAD-V6 ONTO module to document and analyze what has been done. Subsequently, the knowledge can be used to modernize the legacy system with strong evidence and confidence of impact.\n\n\nFigure. The “Astronaut Pointing a Gun” meme is repurposed to illustrate the self-referential power of the Software Archaeology Tools Platform. This layered analysis establishes the Software Archaeology Ontology as the “queryable brain” for confident legacy system modernization, forming the completed puzzle of Software Archaeology.\n\n1. You Can Literally Ask Your Codebase Questions\n\nTraditionally, understanding data lineage is a painful, manual hunt. It’s an archaeology dig through unfamiliar code, outdated wikis, and the hazy memories of the last senior engineer who touched the code. It’s guesswork, not engineering.\n\nAn ontology-driven approach shatters this paradigm. Instead of detective work, you get deterministic answers. By modeling the system’s components (e.g. services, APIs, databases, tables) and the relationships between them (e.g. dependsOn, calls, produces, writesTo), you create a knowledge graph. With this graph in place, you can use a query language like SPARQL to ask precise, complex questions and get definitive answers.\n\n\n\nFigure. Illustrate the partial SAR Ontology modeling the Software Entity hierarchy. The ontology can be loaded into the popular Protégé platform for maintenance and visualization needs. We can also use Python programmatically manage and query the ontology, using rdflib or owlready2.\n\nFor example, instead of manual guesswork, you can now ask:\n\n\n  “Which components have the most dependencies (coupling hotspots)?”\n  “Which API endpoints write to which database tables?”\n  “When a user interacts with the React UI dashboard, where is data stored?”\n  “What technical debt items have the highest priority for remediation?”\n\n\nCrucially, the answers to these questions are not hardcoded. 
The answers are discovered in real-time by tracing the relationships that have been formally modeled in the ontology. This turns system understanding from an art into a science. The beauty of this model is that even an incomplete answer is valuable. A query that returns nothing doesn’t mean the tool failed; it means you’ve discovered a gap in your knowledge model that needs to be filled.\n\nKey Insight: The quality of data lineage tracing depends on how completely the system relationships are modeled in the SAR ontology. Gaps in the results indicate areas where the ontology needs enhancement.\n\nFrom Ontology to Knowledge Graph Explained\n\nThe ontology serves as the blueprint or formal semantic model (or schema) that dictates the structure and relationships within a domain, defining concepts like Database and how it can relate to a Service using properties such as sar:dependsOn. In contrast, the knowledge graph is the actual data repository that holds the specific, real-world facts—the complete Software Archaeology Platform dataset in this case—which is represented as instance triples derived from and governed by the ontology schema.\n\n\n\nFigure. Knowledge Graphs are machine readable data structures that represent knowledge of the physical and digital worlds as modeled in an ontology.\n\nTherefore, the knowledge graph is built upon the ontology, combining the abstract structural definitions (the schema triples) with the concrete data (the instance triples). This combination allows compl..."
  },
  
  {
    "title": "BMAD V6 Intellectual Ecosystem for Understanding Legacy Software",
    "url": "/bmad-v6-intellectual-ecosystem",
    "date": "Oct 17, 2025",
    "categories": ["post"],
    "tags": ["BMAD","AI Agents","AI Philosophy","Software Archaeology"],
    "excerpt": "\nWhen we think about interacting with AI, the image that often comes to mind is a simple conversation: we ask a question, and the AI gives us an answer. We use prompts to generate content, analyze ...",
    "content": "\nWhen we think about interacting with AI, the image that often comes to mind is a simple conversation: we ask a question, and the AI gives us an answer. We use prompts to generate content, analyze data, or write code. In this model, AI is a powerful but ultimately reactive tool, an oracle waiting for our next query. This approach unlocks a fraction of its potential, especially when facing truly complex, systemic challenges.\n\n\n\nFigure. A mental model is more than a way of thinking — it’s the architecture of understanding itself.\n\nBut what if we moved beyond simply prompting an AI? A framework called BMAD-V6 (Breakthrough Method for Agile AI-Driven Development - version 6) presents a more profound perspective. It frames AI not as a tool to be queried, but as a component within a complete intellectual ecosystem. It offers a philosophy of systematic problem-solving, one that redefines how humans and AI can collaborate to deconstruct, understand, and resolve intricate challenges. This article explores five impactful ideas from this philosophy that can change how you think about AI and problem-solving.\n\n\n  Why BMAD-V6 Matters? The BMAD-V6 framework (currently in alpha) represents a major evolution of the BMAD philosophy. Its most significant addition, the BMAD Builder module, enables users to create new modules within the BMAD ecosystem itself. This means we’re no longer limited to static components. With BMAD-V6, we can extend and customize the system dynamically, ensuring that new agents and workflows integrate seamlessly and operate cooperatively as part of a unified, intelligent ecosystem.\n\n\n1. First, You Must Tame the Chaos\n\nThe BMAD V6 framework’s first step is not to ask a question but to perform a rigorous act of definition. It demands a clear Problem Statement, an identification of the problem’s Impact, the articulation of the solution’s Value Propositions, and a deliberate statement of what is “In Scope” and “Out of Scope.” This initial discipline is philosophically critical. It is analogous to the act of naming in science, which transforms a chaotic, amorphous problem, like “our legacy code is a mess” into a bounded, addressable challenge.\n\nThis act of definition forces clarity before any action is taken. It asserts that you cannot solve a problem you have not first defined. By creating boundaries, the framework converts an overwhelming and seemingly infinite problem space into a finite territory that can be explored methodically. It creates order from chaos through deliberate definition.\n\nBy drawing a line in the sand, it transforms an infinite, overwhelming domain into a finite territory that can be methodically explored.\n\nOnce a problem is defined and bounded, the next step is to assemble the right kind of intelligence to explore it.\n\n2. Build a Team of Personas, Not a Monolithic AI\n\nInstead of creating a single, monolithic “helper” AI, this philosophy insists on building a team of specialized AI agents. This is a profound philosophical choice and a radical departure from conventional thinking. It suggests that true expertise is not a generic, singular intelligence but a tapestry of distinct, specialized perspectives. Each agent is a “personified philosophy of practice,” embodying an unique mode of expert thinking.\n\n\nFigure. 
From Linear Logic to Living Systems: Rethinking AI as an Ecosystem of Experts. Instead of building a single, monolithic “helper” AI, this philosophy embraces systems thinking—a shift from linear cause-and-effect reasoning to dynamic networks of interdependent parts. Each AI agent becomes a specialized node in a living system of intelligence, embodying its own philosophy of practice and unique mode of expert reasoning.\n\nFor example, a team designed to analyze legacy software might include:\n\n\n  Dr. Ada (the Archaeologist): Embodies the principle of systematic, patient discovery. Her philosophy represents the deep, methodical investigation required to understand the why behind the code, operating under the belief that “Context is everything.”\n  Atlas (the Code Cartographer): Embodies the philosophy of visual thinking. Her core principle, “A picture is worth 10,000 lines of code,” asserts that complex relationships can only be truly grasped when they are mapped spatially.\n  Morgan (the Documentation Curator): Embodies a user-centric philosophy of clarity. His belief that “Documentation should answer questions before they’re asked” shifts the focus from simple recording to proactive, empathetic knowledge transfer.\n\n\n\nFigure. An illustration of different personas, each defining an AI agent’s identity, communication style, and the principles that guide its mental model of practice and unique mode of expert reasoning.\n\nThe power of this approach lies not just in the individual experts, but in designing the system of interaction between them. By designing agents this way, we codify different professional philosophies and allow them to collaborate.\n\nThe solution to the problem emerges not from o..."
  },
  
  {
    "title": "Applied BMAD - Reclaiming Control in AI Development",
    "url": "/bmad-reclaiming-control-in-ai-dev",
    "date": "Oct 01, 2025",
    "categories": ["post"],
    "tags": ["BMAD","AI Engineering","AI Agents","Spec Coding"],
    "excerpt": "\nEnterprise leaders are pouring investment into AI-assisted development, but many are finding that the promised productivity gains come at a steep price: a loss of governance, traceability, and arc...",
    "content": "\nEnterprise leaders are pouring investment into AI-assisted development, but many are finding that the promised productivity gains come at a steep price: a loss of governance, traceability, and architectural integrity. Unstructured, prompt-driven AI use is creating “black box” codebases that are difficult to maintain, audit, and scale, introducing significant business risk. To move from ad-hoc experimentation to enterprise-grade AI integration, a new paradigm is required.\n\n\nWe applied BMAD Method (Breakthrough Method for Agile AI-Driven Development) as a strategic framework to ensure that AI-driven development is not only fast but also predictable, compliant, and aligned with long-term business objectives. Integrated with robust version control, this method offers a transformative paradigm for AI engineering, enabling super individual to work with unparalleled precision and clarity.\n\n\nFigure. Visualizing the transformation from fragmented, high-risk AI experimentation to BMAD’s structured, enterprise-ready development framework\n\nThe Abstraction Trap: Loss of Control\n\nThe core problem stems from the inherent tension between abstraction and control in AI-assisted coding. As illustrated in the AI-generated code process, a high-level natural language prompt is transformed through AI models into source code, eventually becoming a running program. While this offers immense speed, it often acts as a black box.\n\n\nFigure. The typical AI code generation process today, illustrating a linear transformation from a high-level prompt to a running program. This “black box” approach often requires developers to iterate extensively on prompts, restarting the process with each change and hoping the AI interprets and corrects issues effectively (source: Philomatics 2024).\n\nAs programming languages and tools increase in their level of abstraction, developers gain immense speed but face a corresponding loss of control. Writing low-level code offers precision and accountability, while high-level code introduces efficiency at the expense of some oversight. With natural language prompts as our programming language to AI-generated programs, this trade-off deepens.\n\n\nFigure. Higher abstraction accelerates development but sacrifices control, with AI tools amplifying speed at the cost of precision and accountability (source: Philomatics 2024).\n\nThe philosophical underpinning, as Harrison Ainsworth’s “Tractatus Computo Philosophicus” eloquently states, is that “Software is a logical construction,” and “its correctness must ultimately be left to human judgement.” Current AI tools, while powerful, lead to a noticeable “Loss of control” as the level of abstraction increases. This gap leaves developers and consultants struggling to ensure precision, track intent, and confidently modify AI-generated solutions. When an AI produces code that isn’t quite right, the iterative process of refining prompts becomes a guessing game, devoid of structured collaboration or versioned accountability. This unstructured AI assistance, beneficial for individual tasks, falls short in multi-person, multi-stage team environments, leading to disparate outputs and a lack of collective understanding.\n\nFrom Chaos to Clarity: The BMAD Method\n\nThe BMAD Method provides the crucial framework to re-establish control. 
At its heart, BMAD leverages specialized AI agents, each embodying a specific role within the software development lifecycle: Analyst, Product Manager, Architect, Scrum Master, Product Owner, Developer, and QA.\n\nUnlike traditional AI tools that merely assist an individual, BMAD agents collaboratively generate and refine critical project artifacts, ensuring an unprecedented level of precision from the project’s inception.\n\nThe true transformation, however, lies in overlaying this agentic process with robust Git-based versioning. Every artifact, from the Product Requirements Document (PRD) to the architecture and granular stories, is treated as a versioned asset. This crucial step ensures:\n\n\n  \n    Traceability and Accountability: Every change, whether from a human or an AI agent, is thoroughly tracked. This eliminates the black-box effect, providing a clear audit trail that is crucial for regulated industries and internal governance standards.\n  \n  \n    Collaborative Review and Refinement: Stories undergo rigorous pull request (PR) reviews where humans and AI agents inspect and comment on AI-generated content. This structured feedback loop prevents the accumulation of technical debt and reduces post-release bugs, directly lowering maintenance costs.\n  \n  \n    A Single Source of Truth: With all planning and implementation artifacts versioned and centralized, the entire team operates from a consistent blueprint. This prevents costly context loss between teams and reduces the rework common in projects with siloed information.\n  \n  \n    Enhanced Productivity and Quality: By automating the creation of high-fidelity project artifacts, BMAD significantly acceler..."
  },
  
  {
    "title": "The Rise of the AI-Powered Super Individual",
    "url": "/rise-of-the-ai-powered-super-individual",
    "date": "Aug 22, 2025",
    "categories": ["post"],
    "tags": ["AI","Philosophy","Life Strategy","Future of Work"],
    "excerpt": "\nThe professional identity you’ve spent years building is becoming a liability. The specialized skills and hard-won experience that once guaranteed your value are being systematically eroded by a f...",
    "content": "\nThe professional identity you’ve spent years building is becoming a liability. The specialized skills and hard-won experience that once guaranteed your value are being systematically eroded by a force that learns faster than any human ever could. This isn’t a distant forecast; it’s a tectonic shift happening right now. Artificial intelligence is not just automating tasks; it’s dismantling the very foundations of the modern workplace, making old methods and established expertise a dangerous anchor in a rapidly changing world.\n\n\nWhile this transformation renders old career paths obsolete, it also clears the way for a new kind of professional to emerge. The advantage is shifting away from what you know to how you think. For those who can adapt, leveraging this disruption is the greatest opportunity of our generation, a chance to redefine productivity and influence on an unprecedented scale. This is not another academic debate about the future of work; it is a practical guide for surviving, and thriving, in the here and now.\n\n\n\nTL;DR\n\n\n  Experience is Devalued, Meta-Skills are Crucial: AI is making years of accumulated, specialized knowledge obsolete, placing a new premium on timeless “meta-capabilities” like problem-solving, critical thinking, and decisive judgment.\n  The Rise of the “Super Individual”: AI empowers individuals to achieve the output of entire teams by acting as an “external brain” or a manager of “silicon intelligences,” leading to unprecedented levels of productivity.\n  AI is Eliminating “Fake Work”: The bureaucratic “glue work”—excessive meetings, coordination, and manual data-shuffling between software systems—that arose from organizational inefficiencies is being automated, streamlining workflows.\n  The Playing Field is Leveled for Young Innovators: By providing powerful tools and democratizing access to information, AI removes the traditional barriers to entry, allowing young, adaptable individuals to compete with and even outperform seasoned professionals burdened by old methods.\n  A New Era of Lean, AI-Native Entrepreneurship: The future belongs to small, agile teams and “super individuals” who can leverage AI to build powerful applications without the need for large organizational structures, shifting the focus from B2B tools to a wider range of consumer and government solutions.\n\n\nA Survival Guide for the AI-Powered Workforce\n\nFor decades, the career ladder was built on a simple premise: accumulate experience. The more years you put in, the more specialized knowledge you gained, the more valuable you became. That premise is now broken. Artificial intelligence is not just another tool; it’s a fundamental shift that is actively devaluing traditional experience and dismantling the very structure of the modern workplace.\n\nThe changes coming in the next two to five years will be swift and unforgiving. The comfortable routines, the bureaucratic processes, and even the skills that define your professional identity are being rendered obsolete. This isn’t a distant threat, it’s a present reality. Positions are already being eliminated, not by economic downturns, but by a manager’s simple choice: “Do you want more headcount or more computing power?”. Companies are already quantifying the trade-off in stark terms; Uber’s internal AI “Genie” saved 13,000 engineering hours, while Commercial Bank of Dubai saved 39,000 hours annually with Microsoft 365 Copilot.\n\nThe good news is that this disruption creates an unprecedented opportunity. 
While AI punishes those who cling to the past, it massively empowers those who adapt. A new class of professional is emerging: the “super individual.” This isn’t about being a technical genius; it’s about cultivating a new set of core skills and learning to manage AI as an extension of your own intelligence. This is your guide to becoming one.\n\nThe Great Devaluation: When Experience Becomes Baggage\n\nThe core of the disruption is this: AI has leveled the playing field for experience and technical accumulation. The complex tasks and industry knowledge that once took years to master can now be replicated or learned in a fraction of the time. A controlled experiment showed developers using GitHub Copilot completed a coding task 55.8% faster than those without it. As one expert notes, “A university student, or even a high school student, can start working on almost the same starting line as an engineer with ten years of experience, and without the cognitive baggage.”\n\nThis “cognitive baggage” is the single greatest threat to experienced professionals. Your hard-won identity: “I’m a Java expert,” “I’m a senior marketer,” “I’m a seasoned analyst”, becomes a sunk cost that makes it difficult to pivot. The evidence for this shift is compelling. A study of 5,179 customer support agents found that while AI assistance increased overall productivity by 14%, the biggest gains were seen by novice and low-skilled workers, whose performance improved by up to 34%. Conversely, the m..."
  },
  
  {
    "title": "An AI-Powered Approach to Structural Abstraction - The KRA Model in Reverse Engineering",
    "url": "/ai-powered-approach-to-structural_abstraction",
    "date": "May 14, 2025",
    "categories": ["post"],
    "tags": ["AI","Abstraction","Reverse Engineering","Philosophy"],
    "excerpt": "\nEver felt utterly swamped? Faced with a complex system, a mountain of data, or a dense technical document, it can feel like trying to drink from a fire hose. Raw, unfiltered information at scale i...",
    "content": "\nEver felt utterly swamped? Faced with a complex system, a mountain of data, or a dense technical document, it can feel like trying to drink from a fire hose. Raw, unfiltered information at scale is simply overwhelming. Cutting through that noise, finding the core essence, and truly understanding the heart of complex ideas is a fundamental challenge.\n\n\nThis is where abstraction comes in. Far from being a dusty, academic concept, abstraction is a vital, pervasive tool that shapes how we interact with the world, how we think, and critically, how we build and understand intelligent systems. Looking through the lens of the KRA (Knowledge Representation &amp; Abstraction) model [1], and considering how AI can be leveraged, we can see abstraction not just as a philosophical notion, but as an operational necessity for navigating complexity.\n\n\n\nFigure. Overwhelmed by Complexity: In a world of unfiltered information and intricate systems, abstraction becomes the key to clarity and understanding.\n\nThe striking thing about abstraction is its universality. Its roots literally mean “to draw away,” but its meaning has evolved. It’s not just about taking things away; it’s about purposeful simplification. “To abstract,” as Goldstone and Barsalou [4] put it, “is to distill the essence from its superficial trappings.” This process is found everywhere – in philosophy formalizing how we form abstract ideas, in mathematics finding general patterns beyond specific numbers, in natural language, in art, and in our own cognitive processes. In computer science, it is completely vital; without abstraction, we couldn’t possibly manage the sheer scale and intricacy of modern software or data structures. We’d be drowning in the weeds.\n\nThis article closely aligns with the ideas presented by Saitta and Zucker in “Abstraction In Artificial Intelligence And Complex Systems” [1] while adding our own perspective on how abstraction can be practically applied in the context of reverse engineering [2,3].\n\nAbstraction in AI\n\nAnd while its presence is felt across many domains, abstraction becomes absolutely essential when we talk about Artificial Intelligence. For AI to tackle really tough problems - planning complex sequences of actions, solving large-scale constraint satisfaction problems like scheduling, representing vast amounts of knowledge, or enabling agents (like robots) to function in the real world; it cannot operate solely on ground-level detail. It needs abstract views. As we surveyed recent researches, abstraction isn’t merely a nice-to-have for advanced AI; it’s foundational. Without a solid grasp and application of abstraction, much of the sophisticated AI we envision simply wouldn’t be possible.\n\n\n\nFigure. Abstraction across Disciplines: Highlighting its fundamental role in Philosophy, Mathematics, Art/Natural Language, and Computer Science.\n\nGiven its importance, defining abstraction precisely is, perhaps fittingly, tricky. There isn’t one single universally agreed-upon definition. Different theories focus on different aspects: syntactic approaches look at manipulating the structure of information (like summarizing text by shortening sentences), while semantic approaches focus on preserving meaning (ensuring the summary captures the core message). Related concepts like granularity (the level of detail, like looking at a city map vs. 
a street map) and reformulation (changing the representation, like discussing spending trends instead of listing individual purchases) interact with abstraction, but abstraction’s core characteristics set it apart.\n\nKey among these characteristics is information reduction – deliberately removing details irrelevant to the current goal. It’s also often an intensional property, relating to concepts and meanings rather than specific instances. Critically, abstraction is relative; what’s abstract depends entirely on the context and the purpose. It’s fundamentally a process, an active transformation, not a static state. And it crucially involves information hiding, intentionally concealing details to simplify focus. While generalization (finding common features to group things) or approximation (replacing precise info with something less exact) are related, abstraction uniquely combines information reduction with purposeful focus on the essential for a specific task.\n\nThe KRA Model: A Framework for Operational Abstraction\n\nOkay, so we understand the concept, but how do we do it, especially in a structured way for AI? This is where the KRA model [1] comes in. Presented as a central framework, KRA provides a systematic way to understand and perform abstraction, making it operational. It defines abstraction not in a vacuum, but tied to:\n\n\n  Query Environment (Q): The specific task or question driving the need for abstraction. The abstraction is performed relative to Q – what is relevant depends on what you are trying to figure out or achieve.\n  Description Frame (Γ): This is the schema or vocabulary that de..."
  },
  
  {
    "title": "Unlocking the Secrets of Tabletop Games Ontology",
    "url": "/unlocking-secrets-of-tabletop-games-ontology",
    "date": "Feb 24, 2025",
    "categories": ["post"],
    "tags": ["Ontology","Tabletop Games","Board Games","Game Design"],
    "excerpt": "\nThere’s a fascinating concept at play in the world of tabletop games, something akin to a secret language. If you can crack this code, you might start to understand why some games rise to legendar...",
    "content": "\nThere’s a fascinating concept at play in the world of tabletop games, something akin to a secret language. If you can crack this code, you might start to understand why some games rise to legendary status while others fade into obscurity. This isn’t just about memorizing rules or mastering strategies; it’s about understanding the Ontology of Games.\n\n\nYou might be asking, what exactly is an Ontology? In the simplest terms, it’s a structured way of organizing knowledge. Think of it as a map of the game universe, not just listing key terms like “worker placement” or “deck building,” but also showing how these terms relate to one another (see Engelstein, Geoffrey, and Isaac Shalev  Building Blocks of Tabletop Game Design: An Encyclopedia of Mechanisms  for the full taxonomy of tabletop games) . It’s more than a glossary; it’s about understanding connections. And those connections, once you see them, allow you to do some pretty cool things.\n\n\n\nFigure. Dune: Imperium is a tabletop game that finds inspiration in elements and characters from the Dune at the center of the conflict is Arrakis. This will be one of our case study.\n\nFor instance, when an ontology includes the concept of resource management and links it to mechanics like dice rolling and trading, patterns emerge. You start to see how these elements create different player experiences. It’s like having a blueprint for understanding what makes a game tick.\n\nThis approach is not just theoretical. Companies like  Palantir - built on ontology-driven business processes to organize and analyze vast amounts of data, making connections and predictions that wouldn’t be possible otherwise. With the rise of AI, ontologies have become even more crucial. They provide the structured information that AI systems need to learn and reason. If we want to train an AI to understand tabletop games, we need to give it a solid framework. That’s where ontologies come in.\n\n\n\n  We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM.\n\n\nimages/unlocking-secrets-of-tabletop-games-ontology/Podcast_Tabletop_Games_Ontology.mp3\n\n\n\n\n\n\nAre you ready to explore ontology as a method for structuring knowledge in tabletop games? This article goes beyond simply knowing the rules, and instead focuses on understanding the relationships between game elements such as game types, mechanics, components, and players. By employing a simplified tabletop game ontology, both designers and players can systematically analyze and design games. By using two popular tabletop games — Catan and Dune: Imperium, we illustrate how ontologies can help in understanding their underlying the deeper structures and mechanics.\n\nWhat is the Ontology of Games?\n\n\n  An ontology is a formal, explicit specification of a shared conceptualization.\n\n\nIn computing excluding the complexity from philosophical traditions, an ontology is a structured knowledge representation that defines concepts, relationships, and categories within a specific domain, in this case, tabletop games. Essentially, it acts as a shared vocabulary that enables intelligent information systems to understand and integrate data more effectively. Unlike traditional databases, ontologies allow for more dynamic and context-aware data management, facilitating interoperability between different applications. 
Ontologies have been widely used in solving data integration challenges and enhancing artificial intelligence applications, such as question-answering systems, scientific discovery, and automated reasoning, making them a crucial component in modern knowledge-driven technologies.\n\nA comparison between relational databases and ontologies as knowledge bases reveals that, unlike RDBMSs, ontologies (knowledge bases) include the representation of the knowledge explicitly, by having rules included, by using automated reasoning (beyond plain queries) to infer implicit knowledge and detect inconsistencies of the knowledge base, and they usually operate under the Open World Assumption .\n\nThis same flexibility can be leveraged in tabletop game design, where intricate systems of rules, mechanics, and player interactions demand a structured yet adaptable framework. By utilizing a simplified tabletop game ontology, designers can systematically dissect and reimagine the core components of gameplay, fostering innovative designs that balance structure, strategy, and player engagement.\n\nExploring Tabletop Game Design Through a Simplified Ontology\n\nWe know an engaging tabletop game requires a thoughtful balance of structure, strategy, and player interaction. By leveraging a simplified tabletop game ontology, we can break down the essential elements that make a game both effective and enjoyable. This model emphasizes core components - Game Types, Mechanisms, Components, and Players - allowing us to focus on how these elements interact to create compelling gameplay experiences.\n\n\n  Game Types - Basic classification of games (e..."
  },
  
  {
    "title": "Personal Embedding and Fine-Tuning with the FLUX Model on Replicate",
    "url": "/personal-fine-tuning-with-flux",
    "date": "Dec 27, 2024",
    "categories": ["post"],
    "tags": ["Generative AI","Stable Diffusion","Image Generation"],
    "excerpt": "\nImage generation has made tremendous strides in the past six months. As someone deeply invested in exploring the creative potential of models like “Stable Diffusion” and “Dreambooth”, I’ve been th...",
    "content": "\nImage generation has made tremendous strides in the past six months. As someone deeply invested in exploring the creative potential of models like “Stable Diffusion” and “Dreambooth”, I’ve been thrilled to discover the capabilities of the cutting-edge FLUX model. This breakthrough innovation, combined with Replicate’s user-friendly cloud-based platform, has transformed the process of creating personalized embeddings.\n\n\n\n\nFigure. All of the creative imaginations of personal embedding into the multiverses using FLUX.1 model. The possibilities are limitless.\n\nNo longer confined to cumbersome local installations, this updated methodology unlocks a new level of simplicity and accessibility. Whether you’re a creative professional, an AI enthusiast, or just curious about bringing imaginative multiverse versions of yourself to life, this guide will walk you through every step of the journey.\n\nTowards the end, we cover advanced topics like multi-LoRA generation—combining multiple fine-tuned models for unparalleled creativity—and building a microservice for streamlined image generation. These sections will empower you to scale your creative workflows and integrate cutting-edge tools into your own applications. Dive in and discover how the FLUX model, paired with the power of Replicate, can redefine the way we generate and fine-tune personal embeddings.\n\n\n\n  We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/personal-fine-tuning-with-flux/Podcast_Personal_Embedding_with_Flux.mp3\n\n\n\n\n\nWhy Choose the FLUX Model?\n\nFLUX.1 is a family of text-to-image models released by Black Forest Labs this summer. The FLUX.1 models set a new standard for open-source image models: they can generate realistic hands, legible text, and even the strangely hard task of funny memes.\n\nVariants of FLUX.1\n\nThe FLUX.1 series includes three main versions, each tailored to specific use cases:\n\n\n  \n    FLUX.1 pro: A premium, closed-source model designed for commercial use. It offers superior performance and supports enterprise-level customization via the official API.\n  \n  \n    FLUX.1 dev: An open-source, non-commercial version derived from FLUX.1 pro, providing high-quality images and robust prompt adherence while being more efficient. (this article will concentrate on utilizing FLUX.1 dev for the purpose of personal embedding and fine-tuning.)\n  \n  \n    FLUX.1 schnell: An open-source, lightweight model optimized for speed and minimal memory usage, ideal for local development and personal projects.\n  \n\n\nGetting Started: Fine-Tuning the FLUX Model with Replicate\n\nTo ensure that your neural network gets trained properly, it is imperative to provide adequate amounts of images that represent you in a variety of looks, poses and backgrounds. If you only give the AI pictures of you making one pose or wearing one outfit it will only be able to generate images matching this input. Giving your AI a diverse set of images to learn from will ensure a more wide range of options and images.\n\nPrepare your training data\n\nTo start fine-tuning, you’ll need a collection of images that represent the concept you want to teach the model. These images should be diverse enough to cover different aspects of the concept. 
For example, if you’re fine-tuning on a specific character, include images in various settings, poses, and lighting.\n\nHere are some guidelines:\n\n\n  Use 12-20 images for best results\n  Use large images if possible\n  Use JPEG or PNG formats\n  Optionally, create a corresponding .txt file for each image with the same name, containing the caption\n\n\nOnce you have your images (and optional captions), zip them up into a single file.\n\nExample - Preparing Personal Images for Fine-tuning\n\nBased on Jame Cunliffe’s video tutorial, here’s a recommended breakdown that I found successful:\n\n  3 full-body shots from different angles\n  5 half-body shots from various perspectives\n  12 close-ups showcasing a range of facial expressions\n\n\nResize your images to 512x512 pixels using tools like BIRME or Photoshop. Once resized, compress the files into a zip archive.\n\n\n\nOnce you have your images (and optional captions), zip them up into a single file, e.g. we used bennycheung-portrait.zip in the following example.\n\n\nFine-Tuning with Replicate\n\nReplicate.ai is a powerful platform that streamlines the deployment, execution, and fine-tuning of machine learning models in the cloud. It offers access to a wide array of pre-trained models developed by various researchers and developers, enabling users to tackle tasks such as image generation, natural language processing, and other specialized AI applications.\n\nSetting Up the Environment\n\nWhile Replicate provides a user-friendly interface, developers may prefer a programmatic approach to interact with the platform. This method offers the significant advantage of automating the generation process, eliminating the need to rely o..."
  },
  
  {
    "title": "Seeing Logic - The Power of Existential Graphs and Visual Thinking",
    "url": "/seeing-logic-with-existential-graphs",
    "date": "Nov 10, 2024",
    "categories": ["post"],
    "tags": ["Visual Logic","Knowledge Graph","Existential Graph","AI Education"],
    "excerpt": "\nLogic is inherently challenging—not only difficult to get right, but often tough to even grasp. Abstract concepts do not come naturally to most people, which is why visual logic can be transformat...",
    "content": "\nLogic is inherently challenging—not only difficult to get right, but often tough to even grasp. Abstract concepts do not come naturally to most people, which is why visual logic can be transformative. Visual logic makes logic visible, bridging the gap between abstract reasoning and intuitive understanding. When logic is represented visually, we begin to comprehend it in a more intuitive way, similar to how we understand physical objects like chairs or doors. This matters because understanding logic is not just a niche skill for mathematicians or computer scientists; it is fundamental to how we perceive and interact with the world.\n\n\n\n\nFigure. This cycle emphasizes the power of visual logic in making abstract ideas more accessible and intuitive.\n\n\n\n  We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/seeing-logic-with-existential-graphs/Podcast_Seeing_Logic.mp3\n\n\n\n\n\n\nCharles Sanders Peirce and Existential Graphs\n\nThe idea of visualizing logic is not new. In the 19th century, Charles Sanders Peirce, a philosopher and logician, sought to make abstract ideas concrete. He developed a system called existential graphs, which were diagrams that depicted logical relationships in a visual format. By drawing logic on a blank piece of paper, Peirce aimed to transform logic into something visible, making abstract relationships tangible and accessible. For example, drawing a circle around a statement indicated its negation—”not that.” This simple visual representation allowed individuals to see logical operations rather than solely interpret symbols mentally.\n\n\n\nFigure. Developed by Peirce, existential graphs allow abstract concepts to be represented concretely through visualization of logic.\n\nPeirce’s existential graphs provided a way to represent logical statements visually. For instance, to express “the sky is not blue,” one would write “the sky is blue” inside a circle, with the circle representing negation. This visual approach made abstract logic concrete, allowing individuals to see logical operations rather than interpret abstract symbols mentally.\n\nAlthough it may seem obvious in retrospect, Peirce’s approach was innovative for his time. By representing logic visually, he demonstrated that abstract concepts could become more intuitive when made visible. His existential graphs transformed the invisible workings of logic into tangible diagrams that the brain could readily comprehend.\n\nPeirce emphasized the power of iconic representations, where the structure of a symbol resembles the concept it represents, making it intuitively easier to understand. Existential graphs embody this principle by structuring logical statements in a visual format, using different types of graphs to represent various levels of logic, namely Alpha Graphs and Beta Graphs.\n\nAlpha Graphs: Visualizing Propositional Logic\n\nAlpha graphs are the simplest form within Peirce’s existential graph system, corresponding to propositional logic. In propositional logic, statements are treated as simple units (propositions) that can be combined using logical operations like “and,” “or,” and “not,” without reference to specific entities or variables. Alpha graphs visually represent these statements and their combinations using straightforward conventions:\n\n\n  Letters or symbols represent individual propositions.\n  Enclosures called “cuts” are used to represent negation. 
Placing a proposition within a cut negates that proposition, analogous to a “not” in traditional symbolic logic.\n\n\nFor example, consider the statement “It is not raining.” In an Alpha graph, the proposition “It is raining” would be written, and a cut (a closed curve) would be drawn around it to indicate negation.\n\n\n\nFigure. This diagram illustrates Peirce’s Alpha Graph notation, which represents propositional logic in a graphical format. The left column shows common logical statements in English, the middle column (F) displays their representation in traditional symbolic logic, and the right column (EG) demonstrates how each statement is expressed in Alpha Graphs.\n\nBy using Alpha graphs, Peirce provided a way to visualize basic logical operations without the need for textual or symbolic manipulation, allowing users to “see” the relationships between propositions. This makes Alpha graphs an iconic and intuitive way to represent propositional logic.\n\nBeta Graphs: Extending to Predicate Logic\n\nBuilding upon Alpha graphs, Beta graphs incorporate predicate logic (also known as first-order logic), which introduces variables and relationships between objects. Predicate logic allows for more complex statements involving individual entities and their properties or relationships, such as “All humans are mortal” or “Socrates is a human.” In Beta graphs, Peirce introduced additional features:\n\n\n  Lines of identity represent specific individuals or objects, connecting parts of the graph ..."
  },
  
  {
    "title": "The Intelligence Age and How to Thrive in It",
    "url": "/intelligence-age-how-to-thrive",
    "date": "Sep 28, 2024",
    "categories": ["post"],
    "tags": ["Human Life","AI","AI Education","Philosophy"],
    "excerpt": "\nHuman existential crisis occurs when someone deeply questions the meaning, purpose, or value of their life. It often leads to feelings of anxiety, uncertainty, or confusion about one’s direction a...",
    "content": "\nHuman existential crisis occurs when someone deeply questions the meaning, purpose, or value of their life. It often leads to feelings of anxiety, uncertainty, or confusion about one’s direction and role in the world. Throughout history, humans have faced existential questions in various forms. Whether it was cavemen pondering their place in the natural world or us navigating the complexities of the Intelligence Age, grappling with meaning and purpose seems to be a timeless part of the human experience.\n\n\nBenjamin Bratton’s recent post, “The Five Stages of AI Grief,” and Avital Balwit’s “My Last Five Years to Work” both explore deep existential questions. It appears that the Intelligence Age is heightening these concerns. How can we address this growing issue?\n\n\n\nFigure. Cavemen’s existential crisis in the Intelligence Age - Just as we are confronting our own existential questions about AI’s role in society today, this artwork suggests that the challenges of adapting to new technologies are timeless.\n\nWe can address these concerns by focusing on what makes us uniquely human. Encouraging education that emphasizes empathy, creativity, and critical thinking can help. Open conversations about the impact of AI can also ease anxieties. By promoting lifelong learning and adaptability, we empower people to find purpose and stay engaged in a rapidly changing world.\n\n\n\n  We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/intelligence-age-how-to-thrive/Podcast_Human_Perceptive_Efficiency.mp3\n\n\n\n\n\n\nThe Intelligence Age and How to Thrive in It\n\nThis is the dawn of The Intelligence Age, as envisioned by the AI visionary, marks a transformative epoch where AI becomes an integral part of our daily lives. Intelligence is no longer a scarce resource but a ubiquitous commodity, reshaping how we access information, make decisions, and interact with the world. Central to thriving in this new era is the enhancement of Human Perceptive Efficiency &amp; Cognitive Efficacy - our ability to perceive, process, and apply information effectively. This article explores strategies for personal excellence and the pivotal role of education in amplifying human potentiality, ensuring we not only adapt to but also possibility of leading in the Intelligence Age.\n\n\n\nFigure. The serene figure in meditation reflects the pursuit of cognitive clarity and balance, while the robotic and mechanical elements represent the growing role of AI in augmenting human cognition and decision-making.\n\nHuman Improvement\n\nOne way to thrive in the Intelligence Age is by improving our unique human ability which involves enhancing how we perceive, process, and use information. With AI managing routine tasks, we have an opportunity to focus on deeper understanding, creativity, and higher-level problem-solving.\n\n\n\nFigure. Key Human Traits for Thriving in the Intelligence Age\n\n\n  Perceptive Efficiency - Maximizing output while minimizing cognitive effort through effective collaboration with AI.\n\n\nFirst, offload repetitive tasks to AI. Instead of spending hours sorting emails or scheduling appointments, use AI tools like virtual assistants and task automation software to handle mundane activities. 
This frees up valuable time for us to tackle more complex and strategic issues that truly matter.\n\n\n  Cognitive Efficacy - Developing profound understanding and expertise in specific domains.\n\n\nSecond, dive deeper into subjects that interest us by using AI to curate, process, and organize large amounts of information. Engage with learning platforms or research tools that can tailor content to our learning pace. However, it’s crucial to turn this raw information into actionable knowledge by applying critical thinking and reflective methods like the Socratic approach—continuously questioning, discussing, and analyzing ideas to challenge assumptions and enhance understanding.\n\n\n  Connective Density - Expanding cognitive abilities across multiple domains and perspectives.\n\n\nThird, broaden our horizons by connecting different disciplines. AI excels at identifying patterns across vast data sets, allowing us to cross-pollinate ideas from multiple fields. For instance, tools like data visualization software or AI-based research platforms can help us uncover links between areas like psychology, technology, and business, leading to innovative solutions. This interdisciplinary mindset often sparks creativity and can push us toward more groundbreaking discoveries.\n\n\n  Emotional Fluidity - Investing in emotional intelligence to maintain ethical and empathetic decision-making in an AI-driven world.\n\n\nHowever, thriving in the Intelligence Age isn’t solely about leveraging technology. Emotional intelligence becomes even more vital. While machines can analyze data, they cannot understand human emotions, empathy, or complex social dynamics. Investing in emotional intelligence..."
  },
  
  {
    "title": "Creating a Love Song with AI",
    "url": "/creating-a-love-song-with-ai",
    "date": "Aug 17, 2024",
    "categories": ["post"],
    "tags": ["Generative AI","AI Music","AI Avatar","Movie Making"],
    "excerpt": "\nI’ve always dreamed of writing a song for my love one, but my musical skills are somewhat limited. However, I do have a knack for crafting beautiful love poetry. Thanks to advancements in AI, my d...",
    "content": "\nI’ve always dreamed of writing a song for my love one, but my musical skills are somewhat limited. However, I do have a knack for crafting beautiful love poetry. Thanks to advancements in AI, my dream is now achievable using several innovative tools.\nIn this article, I’ll show you how I combined various AI technologies to create a heartfelt music video:\n\n\n\n  Stable Diffusion: Trained a model specifically for my beloved to personalize the visuals.\n  Hedra: Utilized this beta, free tool for the vocal avatar. Despite some limitations, it works remarkably well.\n  FaceFusion: Employed this tool for deep-faking certain music video scenes to add a touch of magic to my loved one. (video clips credits: Vivian Chow 周慧敏 - 《天荒愛未老》)\n  Suno v3: Provided my poetry as lyrics to this tool, which then composed the music and even supplied the vocal performance.\n\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/creating-a-love-song-with-ai/Podcast_Creating_Love_Song_with_AI.mp3\n\n\n\n\n\n\n\n\nFigure. AI-Powered Love Song Creation Workflow: This diagram illustrates the process of creating a personalized music video using a combination of AI tools. Stable Diffusion was used to generate custom visuals, Hedra created a vocal avatar, and FaceFusion added deep-fake effects to integrate these elements into a music video. Suno V3 was employed to compose and perform a song based on the user’s poetry. The final love song movie was edited and compiled in iMovie before being published on YouTube.\n\nWith a final touch of video editing, I managed to weave all these elements into a seamless and touching music video. This project is truly a dream come true, made possible by the incredible capabilities of AI technology.\n\n\n  Here is my Chinese poem, which served as the input lyrics for Suno.\n\n\n[Verse 1]\n春雨綿綿情如畫， | Spring rain, tender like a painting,\n夏夜星空閃無暇。 | Summer night, stars twinkle without flaw.\n秋葉紅遍庭芳秀， | Autumn leaves, red and vibrant in the garden,\n冬雪飄飄暖如霞。 | Winter snow, warm and floating like rosy clouds.\n\n[Chorus]\n光陰流逝歲無憂， | Time passes without worry through the years,\n共渡晨曦共牽手。 | Together, we greet the dawn, hand in hand.\n風雨同舟不曾變， | Through storm and wind, our bond unchanged,\n地老天荒情深厚。 | As the earth ages, our love remains deep.\n\n[Verse 2]\n佳人如玉美若花， | My beloved, as pure as jade, as beautiful as a flower,\n傾心細訴枕邊話。 | Whispering sweet words by the pillow.\n千言萬語送安康， | Thousands of words wish you peace,\n願君笑顏伴芳華。 | May your smile accompany the blooming of youth.\n\n\nView the result of this project 四季的情话 (Love in All Seasons) - Creating a Love Song with AI\n\n\n  \n\n\nIf you’re interested in how this was accomplished, let’s explore each component of the workflow. You might even be able to replicate the process to create a song for your love one.\n\nStable Diffusion - AI Generated Avatar\n\nStable Diffusion is a free AI tool that can be used to generate high-quality images of faces. To use Stable Diffusion, you need to provide it with some starting parameters, such as gender, age, and hairstyle, and it will generate a realistic image of a face that matches those parameters.\n\nTo generate your personal avatar picture, start by building a personal embedding described in our previous post on Dreambooth Training for Personal Embedding. 
You can choose any combination of parameters that you like, depending on the kind of avatar that you want to create. Once you have chosen your parameters, run Stable Diffusion to generate your avatar picture.\n\n\n\nFigure. Use Stable Diffusion (via AUTOMATIC1111 Web UI) to generate the personal avatar image from a prompt.\n\nSuno - AI Music Generation\n\nSuno v3 is a fascinating tool for anyone interested in music creation. It takes the concept of text-to-music and runs with it, allowing users to input lyrics or descriptions and receive a full song in return. This isn’t just a simple jingle; we’re talking about complete tracks that include instrumentals, vocals, and lyrics across a variety of genres—pop, rock, jazz, you name it.\n\nWhat’s impressive is the level of customization available. You can specify the mood, instruments, and style, guiding the AI to create something that resonates with your vision. If you’re not satisfied with the first version, you can generate multiple iterations, tweaking and refining until it feels just right. The vocal synthesis feature adds another layer, enabling the AI to sing or rap your lyrics.\n\nDespite its strengths, Suno isn’t without limitations. Some users have noted issues with song structure and repetitive lyrics. That’s why we can give Suno our lyrics instead of the auto-generated one.\n\nSuno represents a significant leap in AI music generation. It democratizes music creation, making it accessible to both seasoned musicians and casual users. If you’ve ever wanted to turn your words into a love song, Suno might just be the tool you need.\n\nSpecify the style of musics tha..."
  },
  
  {
    "title": "Dify - Your Weekend GenAI Magics",
    "url": "/dify-your-weekend-genai-magics",
    "date": "May 05, 2024",
    "categories": ["post"],
    "tags": ["Generative AI","LLM","Education","AI Platform"],
    "excerpt": "\nWhen we think about learning, especially in fields as complex as AI and the blooming arena of generative AI, there’s a natural inclination to dive deep into theory. Books, papers, lectures – these...",
    "content": "\nWhen we think about learning, especially in fields as complex as AI and the blooming arena of generative AI, there’s a natural inclination to dive deep into theory. Books, papers, lectures – these are the traditional tools of knowledge acquisition. They’re valuable, no doubt, but they represent only one side of the coin. The real magic happens when we take that theoretical knowledge and apply it to real-world projects. This is where the abstract becomes tangible, where ideas transform into something you can see, touch, and interact with.\n\n\nGenerative AI, and particularly tools like ChatGPT, have revolutionized this process. Traditionally, the journey from concept to execution in AI was daunting. It could stretch over months or years, acting as a significant barrier to innovation and broader adoption of AI technologies. But now, with the advent of Large Language Models (LLMs), we’re witnessing a dramatic shift.\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/dify-your-weekend-genai-magics/Podcast_Dify_GenAI_Magic.mp3\n\n\n\n\n\n\n\n\nFigure. Knowledge Gaps in Education (image by qorrectassess.com)\n\nChatGPT and similar tools offer a new paradigm for turning ideas into reality. Through what’s known as prompt engineering, we can now interact with these complex AI models using natural language, describing what we want in plain English, and the model generates or completes our thoughts in a structured, repeatable manner. This isn’t just a minor improvement; it’s a fundamental change in how we approach AI development.\n\nHowever, a significant hurdle remains. Transforming these ingeniously crafted prompts into a viable product is not straightforward. Often, a single prompt isn’t enough to accomplish a task; it requires a sequence of coordinated prompts. This necessitates additional tools and platforms to turn the envisioned product into reality.\n\nFortunately, this challenge is not unique and has been recognized by many. After years of research and practical development by members from Tencent Cloud CODING DevOps team, a solution has emerged that promises to bridge this gap effectively. Among various platforms and tools designed to facilitate the transition from concept to product in the generative AI space, Dify stands out as a particularly innovative solution.\n\n\n  NOTE: The practicality and popularity of Dify are evidenced by the impressive milestone it achieved within just 36 hours of its launch, during which over 1,500 applications were created using this open-source project.\n\n\nDify (Do It For Yourselves) is carving out a unique development space. It’s not just another tool in the vast digital toolbox; Dify is a transformative ecosystem that caters to a wide array of users—from those taking their first steps in AI to seasoned developers and large enterprises. This platform stands out for its versatility and robustness, offering solutions across education, prototyping, agent development, and production. Let’s understand how Dify is making a significant impact in these areas.\n\n\n\nFigure. Dify as a platform for many aspects: education, proof of concepts, agentic product and deploying product.\n\nDify as an Education Platform for Learning Hands-on Generative AI Technologies and Implementation Techniques\n\nGenerative AI technologies, in their infancy, present a landscape both vast and uncharted. 
At first glance, the complexity might seem daunting. Yet, the essence of understanding and utilizing these technologies lies not in mastering a steep learning curve overnight but in embracing a journey of incremental discovery.\n\nAt its core, Dify serves as an educational gateway to the world of generative AI technologies. Dify can fill this educational gap by providing visual learning tools. What sets Dify apart is its Visual AI apps orchestration studio, also known as the Low Code/No Code method, a feature that allows users to visually piece together AI components into functional applications. This significantly lowers the barrier to entry for individuals without deep programming knowledge, making it an exemplary educational tool. The platform’s Retrieval-Augmented Generation (RAG) pipeline further enriches the learning experience by equipping AI applications with a vast vector database, enabling them to produce accurate and contextually relevant outputs. This feature effectively demonstrates how AI can leverage extensive resources to generate informed responses, providing a practical glimpse into advanced AI functionalities in action.\n\nBuilding RAG in 5 minutes\n\nThe following illustrates a simple RAG pipeline using Dify workflow components.\n\n\n\nFigure. Dify’s Knowledge Retrieval + Chatbot template illustrates a RAG pipeline that can be implemented in 5 minutes.\n\nWe can use Dify’s Knowledge Retrieval + Chatbot template to get started. Let’s explore how this workflow template functions ..."
  },
  
  {
    "title": "Lean AI - How to Reduce LLM Cost?",
    "url": "/lean-ai-reduce-llm-cost",
    "date": "Apr 12, 2024",
    "categories": ["post"],
    "tags": ["Lean AI","LLM","Cost","Optimization"],
    "excerpt": "\nIn the blooming field of Generative AI (GenAI), startups are proliferating like wildflowers after a spring rain. The statistics are staggering, with a veritable boom in the last year alone. But am...",
    "content": "\nIn the blooming field of Generative AI (GenAI), startups are proliferating like wildflowers after a spring rain. The statistics are staggering, with a veritable boom in the last year alone. But amidst this growth, a pressing question looms: how do we steer these GenAI products towards business success? It’s akin to navigating a ship through treacherous waters, where mastery of domain knowledge, an innovative concept, and effective cost management are the stars by which we must set our course..\n\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/lean-ai-reduce-llm-cost/Podcast_Lean_AI.mp3\n\n\n\n\n\n\n\n\nFigure. In 2023, North America alone boasted an impressive tally of over 7,000 AI startups. (image from AI startup 2023 statistics)\n\nWe cannot predict which ventures will survive for the next two years, it is clear that a few startups will stand out flawlessly by aligning crucial business elements. These trailblazers are poised for remarkable growth in the foreseeable future, their success underpinned by distinctive product innovations that effectively address significant challenges within niche markets, coupled with their capability to manage operational costs and secure profitable outcomes.\n\nWe often focus on the biggest, fastest, and most capable LLMs. However, when it comes to business, cost savings are almost always at the forefront of business leaders’ minds. The concern is how we can reap the benefits of AI while keeping costs reasonable for an organization. Whether you’re a tech leader aiming to streamline operations or a developer interested in the economical aspects of AI, this article is well worth your time.\n\nEvolving LLM Features\n\nThe evolution of Large Language Models (LLMs) is a tale of rapid advancement, democratization, and specialization. With half a million models now on Huggingface, an open-source repository for shared models, algorithms, and datasets, the landscape is rich with potential. Yet, for business leaders, this fast-paced environment is a double-edged sword. Keeping abreast of new features and ensuring tech debt doesn’t accumulate is akin to running a marathon at a sprinter’s pace. Model maintenance and upkeep become the key to longevity.\n\nBusiness Leaders need to keep up with these features and how to capitalize on them - tech debt will come quickly - model maintenance and upkeep is the key\n\n\n  Longer Context Window: Expand to potentially over 1 million tokens for in-depth analysis of large data.\n  Advanced Reasoning Capabilities: Enhance reliability and accuracy in responses for broader industry application.\n  Improved Inference Speed and Latency: Aim for quicker responses for smoother, natural interactions.\n  Increased Memory Capabilities: Advance memory retention for coherent long-term interactions.\n  Enhanced Vision Capabilities: Improve image and video analysis tools for cost-efficiency.\n  Multimodality: Boost handling of various inputs (text, images, audio, video) for diverse applications.\n  Increased Personalization: Tailor responses using individual user data for customized interaction.\n\n\nConsider the features that are shaping the future of LLMs, these are but a few of the advancements that promise to redefine our interactions with technology. 
Yet, with great power comes great responsibility—and costs to be tamed.\n\nCost Awareness\n\nKnowing where LLMs are heading and which features our product needs, choosing the right LLM is a balancing act. We must weigh time to market against accuracy and quality, performance against cost and scalability, and privacy against the need to protect intellectual property. It’s a delicate dance, where each step must be measured and precise.\n\n\n\nFigure. The triangular framework representing the critical factors that we must balance: accuracy, performance, and privacy.\n\nLet’s use a triangular framework to represent the critical factors involved in selecting the right LLM. At the centre of the triangle is Time to Market, underlining the urgency of deployment. Each corner of the triangle highlights a different priority:\n\n\n  Accuracy &amp; Quality indicated by state-of-the-art (SoTA) large models and a broad spectrum of capabilities.\n  Performance characterized by low latency, cost efficiency, and the ability to scale.\n  Privacy emphasizing the importance of safeguarding intellectual property and refraining from disclosing sensitive information.\n\n\nWe understand the cost of LLMs is not fixed; it’s as variable as the weather. To navigate these shifting sands, we must understand the forces at play: accuracy, performance, and privacy. Balancing these forces is crucial for timely market delivery without compromises. It’s about knowing which levers to pull—and one such controllable lever, which we inspect in greater detail, is cost reduction.\n\nCost Estimation\n\nContinuing with our analogy of predicting the weather, cost est..."
  },
  
  {
    "title": "Exploring LLMs Through Minsky's Lens on Universal Intelligence",
    "url": "/exploring-llms-on-universal-intelligence",
    "date": "Dec 10, 2023",
    "categories": ["post"],
    "tags": ["Philosophy","Universal Intelligence","LLM","AI"],
    "excerpt": "\nAt the recent World Science Festival, the discussion titled “AI: Grappling with a New Kind of Intelligence” proposed the idea that AI could be seen as an evolving “being”, rather than merely a too...",
    "content": "\nAt the recent World Science Festival, the discussion titled “AI: Grappling with a New Kind of Intelligence” proposed the idea that AI could be seen as an evolving “being”, rather than merely a tool. This thought brought me back to a mesmerizing read from April 1985’s Byte magazine. The piece, “Communication with Alien Intelligence”, written by the AI visionary Marvin Minsky, exploring the possibility of reaching out to minds from beyond our world.\n\n\nMinsky argues that intelligent beings, regardless of their origin, face similar constraints and would develop comparable concepts and problem-solving methods. He discusses principles like the economics of resource management and the sparseness of unique ideas, suggesting these are universal. Minsky also addresses potential limitations in understanding vastly different intelligences and emphasizes the importance of basic problem-solving elements in intelligent communication.\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/exploring-llms-on-universal-intelligence/Podcast_Minsky_Universal_Intelligence.mp3\n\n\n\n\n\n\n\n\nFigure. An illustration that metaphorically represents a conversation between human intelligence and artificial intelligence.\n\nThis perspective allows us to rethink the capabilities of Large Language Models (LLMs) and how problem solving techniques could be applied to current constraints.\n\nUniversal Problem-Solving in AI\n\nAI as a form of intelligence has the potential to compete with humans, by developing its current capabilities through principles of universal problem-solving.\n\nSpecifically, the concept of universal problem-solving in AI, particularly in the context of LLMs, could include:\n\n\n  Pattern Recognition: This ability to detect and assimilate patterns is a core attribute of human intelligence. Its effectiveness is dependent on a robust memory architecture.\n  Language Understanding: Comprehension of language involves the interpretation of words and symbols, allowing an AI or human to construct meaningful responses.\n  Adaptation and Learning: In machine learning, adapting to new information through changes in behaviour is a data-driven and algorithmic way of progression.\n\n\nThese examples draw parallels between artificial and natural forms of problem-solving.\n\nLLMs and the Economics of Resource Management\n\nExpanding on Minsky’s discussion, LLMs can be viewed as systems that manage computational resources in order to optimize problem-solving. This involves balancing the allocation of computational power, memory, and data processing capabilities to efficiently process and generate language. Understanding the strategies LLMs use for resource management is key to understanding their operational mechanics and efficiency.\n\nThis involves making informed decisions about the allocation and utilization of limited assets to achieve specific goals efficiently and effectively. Intelligent systems, whether biological or artificial, must often operate within constraints - be it energy, time, materials, or computational power. This ability is not just about conserving resources but also about maximizing their potential impact, a crucial aspect of both human and artificial intelligence.\n\nFor instance, when an LLM processes a complex sentence, it must understand the context, maintain the flow of the conversation, and generate a relevant response. 
This requires a delicate balance of using enough resources to perform these tasks effectively, but not so much that it becomes inefficient. In the context of training LLMs, resource management involves effectively utilizing datasets and computational power. Adequate computational resources are critical for processing extensive datasets and experimenting with various model architectures. However, it is essential to note that avoiding overfitting or underfitting in these models primarily hinges on the quality and diversity of the training data, as well as the appropriate complexity of the model and the implementation of effective regularization techniques.\n\nThe Sparseness of Unique Ideas\n\n“The Sparseness Principle” provides a framework for understanding the nature of intelligence as it pertains to the generation of unique ideas. This principle posits that the universe of possible computational structures is vast, yet the processes that yield significant outputs are few, leading to a natural scarcity of truly unique ideas.\n\nMinsky’s Experiment on the Sparseness Principle\n\n\n  The Sparseness Principle: When relatively simple processes produce two similar things, those things will tend to be identical!\n\n\n\n\nFigure. A universe of possible computational structures (image credit: Byte magazine, April 1985).\n\nThis original diagram represents a visual abstraction from Minsky’s technical experiment on the Sparseness Principle, which is closely tied to the concept of Turing machines as described by Alan..."
  },
  
  {
    "title": "Cracking the Spell of Q* - A New Method in Problem Solving",
    "url": "/q-star-new-method-in-problem-solving",
    "date": "Nov 28, 2023",
    "categories": ["post"],
    "tags": ["AI","Algorithms","Q-Learning","LLM","Problem-Solving"],
    "excerpt": "\nIn the dynamic and ever-changing landscape of AI and computational problem-solving, there arises a new speculative yet intriguing proposal: Q* (pronounced “Q-star”) - This conceptual framework is ...",
    "content": "\nIn the dynamic and ever-changing landscape of AI and computational problem-solving, there arises a new speculative yet intriguing proposal: Q* (pronounced “Q-star”) - This conceptual framework is imagined to position itself at the intersection of three diverse yet interrelated technologies: the A* algorithm, Q-learning, and Large Language Models (LLMs) as systems of compressed knowledge. Although still theoretical, this amalgamation of Q* suggests a synergy that could potentially push the boundaries of AI’s problem solving efficiency and effectiveness, offering a novel approach to tackling complex challenges.\n\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/q-star-new-method-in-problem-solving/Podcast_Spell_of_QStar.mp3\n\n\n\n\n\n\n\n  Disclaimer: While it’s probable that our interpretation of the Q* concept doesn’t align with the mysterious breakthrough by OpenAI mentioned in recent Reuters’s news article, this investigative article has served a great inspiration to solve our AI problems.\n\n\n\n\nFigure. An abstract representation of the concept Q* incorporating the elements of A* search, LLM memory, and Q-learning method in reinforcement learning\n\nThe A* algorithm, renowned for its search capabilities, has long been the go-to method in navigating through complex spaces, whether in gaming environments or robotic path planning. Its heuristic-based approach enables it to make informed guesses about the most efficient paths, dramatically reducing the time and computational resources required to reach a destination. However, the static nature of A* limits its ability to adapt to dynamic environments or learn from past experiences.\n\nEnter Q-learning, a form of reinforcement learning that excels in environments where adaptability and experience-based decision-making are key. Q-learning allows an agent to learn from the consequences of its actions, adapting its strategy to maximize cumulative rewards. This learning capability makes it an ideal partner for A*, bringing a dynamic, adaptive edge to the algorithm’s robust search mechanism.\n\nThe third pillar of Q*, LLMs as compressed knowledge systems, introduces a new dimension to this equation. These models, epitomized by systems like GPT-4, represent a paradigm shift in handling and processing vast amounts of information. By compressing global knowledge into a probabilistic model, LLMs offer a unique form of memory system that Q* can exploit. This integration allows Q* to not only navigate and learn from its environment but also to tap into a wealth of knowledge that can inform and enhance its decision-making processes.\n\nThe potential impact of Q* in advanced computational problem-solving is immense. From navigating complex data landscapes to solving intricate real-world problems, Q* stands poised to redefine the boundaries of what is possible in AI. Its ability to combine efficient search, adaptive learning, and deep, compressed knowledge opens up new avenues for exploration (on search) and exploitation (on knowledge).\n\nAs we continue in this article, we will explore the individual components of Q*, how they synergistically interact, and the practical applications of this new approach. 
Join us on this journey into the heart of Q*, a beacon of the future of problem-solving in the age of intelligent agents.\n\n\n  Why is the method denoted Q*?\n\n  In the method of Q-learning, “Q” represents “quality,” denoting the effectiveness or value of a specific action within a particular state. Fundamentally, it serves as an indicator of the desirability or suitability of undertaking a given action in a certain context. Additionally, “Q*” carries a symbolic significance, representing the highest Q value achievable across all possible search policies, signifying the optimal action to take. Alternatively, the “*” can come from the A* algorithm, which provides the optimal search policy for the solution space.\n\n\nBackground Knowledge\n\nThe A* Algorithm\n\nAt the heart of Q* lies the A* algorithm, a well-known AI tool in the world of search and navigation. A* is a heuristic-based algorithm, which means it makes educated guesses to find the shortest path between two points. This approach distinguishes it from brute-force methods, enabling it to solve complex search problems efficiently.\n\n\n\nFigure. A* algorithm animation on optimal path finding (image credit: imgur comacomacomacomachameleon)\n\nTo understand A*, let’s start with the key features of the algorithm:\n\n\n  \n    Heuristic Function: The core of A*’s efficiency lies in its heuristic function. This function estimates the cost to reach the goal from a given point, often using a measure like straight-line distance. This estimation helps A* to focus its search in the direction of the goal, significantly speeding up the process.\n  \n  \n    Cost Calculation: A* calculates two types of costs - the $g$ cost, which..."
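Since the A* walkthrough is the conceptual core of Q*, here is a compact, self-contained sketch of the algorithm on a small grid. This is our own toy illustration (the grid, costs, and function names are invented), not code from the original article; it simply makes the g cost, the heuristic h, and their sum f concrete.

# A minimal A* sketch on a 4-connected grid (toy illustration only).
import heapq

def a_star(start, goal, walls, width, height):
    def h(p):
        # Heuristic: Manhattan distance, an optimistic estimate of remaining cost
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_heap = [(h(start), 0, start)]   # entries are (f = g + h, g, node)
    best_g = {start: 0}
    came_from = {}

    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node == goal:
            # Reconstruct the path by walking the parent links backwards
            path = [node]
            while node in came_from:
                node = came_from[node]
                path.append(node)
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height) or nxt in walls:
                continue
            new_g = g + 1                 # uniform step cost on the grid
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                came_from[nxt] = node
                heapq.heappush(open_heap, (new_g + h(nxt), new_g, nxt))
    return None                           # no path exists

print(a_star((0, 0), (3, 3), walls={(1, 1), (2, 1)}, width=4, height=4))

Replacing the fixed heuristic h with a learned value estimate is exactly where the Q-learning half of the Q* speculation comes in.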
  },
  
  {
    "title": "Spatial Reasoning Under Uncertainty",
    "url": "/spatial-reasoning-under-uncertainty",
    "date": "Nov 25, 2023",
    "categories": ["post"],
    "tags": ["Spatial Reasoning","Philosophy","Mereology","Mereotopology","AI"],
    "excerpt": "\nSpatial reasoning is a critical aspect of many real-world applications, including urban planning, environmental monitoring, and transportation logistics. It involves the processing and interpretat...",
    "content": "\nSpatial reasoning is a critical aspect of many real-world applications, including urban planning, environmental monitoring, and transportation logistics. It involves the processing and interpretation of spatial data to understand and analyze the relationships between objects in a given space. Traditional GIS systems and geometric processing algorithms often assume that spatial data is precise and accurate. However, real-world spatial data is often imprecise, and incorporating this uncertainty into spatial reasoning can lead to more robust and reliable results. The ability to handle imprecise spatial data, much like human cognitive abilities, is a valuable asset in the field of spatial reasoning and analysis.\n\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/spatial-reasoning-under-uncertainty/Podcast_Spatial_Reansoning_under_Uncertainty.mp3\n\n\n\n\n\n\n\n\nFigure. Illustraing the complexity of spatial reasoning under uncertainty. (image credit: Stable Diffusion AI)\n\nThroughout the post, we will provide imaginary town of “Sensorville” as the examples to illustrate the importance of accounting for uncertainty in spatial decision-making processes. By the end of this article, you will have gained valuable insights into the methods and techniques used to navigate spatial reasoning under uncertainty and their potential applications in various domains.\n\n\n  In our previous discussion titled Spatial Reasoning in AGI - Insights from Philosophical Perspectives, we highlighted the critical role of spatial reasoning abilities for achieving Artificial General Intelligence (AGI). It has become evident that current Large Language Models (LLMs) fall short in spatial comprehension, which is essential for tackling numerous practical challenges. The trend is now shifting towards the incorporation of Knowledge Graph techniques to enhance LLMs with a more concrete connection to the real world. For those who are keenly following this development, the article provides valuable insights on constructing Knowledge Graphs that incorporate spatial relationships amidst uncertainty.\n\n\nSpatial Data Uncertainty\n\nSpatial data uncertainty is a critical aspect to consider when dealing with real-world applications like urban planning, land use decision-making, and infrastructure requirements. Unlike the textbook GIS and computational geometry, real-world data often contains uncertainties that can significantly impact analysis and decision-making processes. By understanding the sources of uncertainty, we can determine whether to address them upfront or incorporate spatial reasoning techniques to resolve them during processing.\n\nThe uncertainty can be arises from many sources, including measurement errors, attribute inconsistencies and geometric uncertainties. The data collection process can easily be tinted with incomplete or missing data, data quality issues and data integration challenges can all contribute to uncertainty in spatial processing tasks.\n\nWhen spatial data cannot be resolved upfront by eliminating the causes of uncertainty, it is crucial for us to develop strategies to manage it. 
To understand the problem more intuitively, let’s tell the story of Sensorville.\n\nThe Story of Imaginary Town - Sensorville\n\nOnce upon a time, in a bustling city named Sensorville, a group of urban planners were faced with the challenge of installing a new network of emergency sensors to protect the residents from a variety of potential hazards such as earthquakes, floods, and fires. The city was divided into numerous distinct zones, each with its unique characteristics, risk levels, and spatial layouts.\n\n\n\nFigure. Illustrating a group of urban planners in the city of Sensorville. (image credit: Stable Diffusion AI)\n\nThe head of the planning team, Dr. Mereo, was an expert in spatial reasoning and mereology. He knew that by understanding the relationships between the zones and their parts, as well as reasoning about the spatial properties of the environment, they could make informed decisions on where to place the sensors.\n\nWe shall see how the story developed.\n\nSpatial Reasoning with Mereology and Mereotopology\n\nSpatial reasoning is a broader concept that encompasses the ability to understand, manipulate, and draw conclusions about objects and their spatial relationships. Inherently, it does not have a built-in theoretical formulation. Through the field’s historical development, we find that Mereology and Mereotopology provide the necessary theoretical frameworks to represent and reason about spatial relationships and address the uncertainties in spatial data. In this section, we will introduce these concepts with minimal symbolic notation to discuss their potential applications in handling spatial uncertainty.\n\nMereology: The study of part-whole relationships\n\nMereology is a branch of formal ontology that deals with the study of ..."
  },
  
  {
    "title": "X.509 Identity for Attribute-based Encryption",
    "url": "/identity-for-attribute-based-encryption",
    "date": "Nov 19, 2023",
    "categories": ["post"],
    "tags": ["PKI","Identity","Cryptography","Attribute-based Encryption","Python"],
    "excerpt": "\nIn the physical world, we trust the identity cards issued by a well known organization, including the government.\nThe verification process is a visual inspection of the card authenticity. Advancin...",
    "content": "\nIn the physical world, we trust the identity cards issued by a well known organization, including the government.\nThe verification process is a visual inspection of the card authenticity. Advancing into the digital realm, we are relying on the Public Key Infrastructure (PKI) to securely identify the participants. This is the key cryptographic technology that enables our internet commerce today! Coupling with the advanced attribute-based encryption, we shall see how to use PKI identity to support a flexible access control to the protected records.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/identity-for-attribute-based-encryption/Podcast_Personal_Identity_ABE.mp3\n\n\n\n\n\n\n\n\nFigure. Artistic style of Kandinsky transfer to a many faces image\n\nIn our previous article on “Attribute-based Encryption for Healthcare Blockchain”\nexplained how to apply attributed-based encryption (ABE) to electronic health records (EHR),\nhelps to layout the roadmap for a secured system implementation.\nHowever, due to the focus on ABE, we just wrote a short paragraph touching on the participant identities and attributes.\nWe assumed the knowledge on public key infrastructure (PKI), X.509 certificates and certificate authority (CA).\nIn addition, the application-level attributes is magically registered into the X.509 certificate.\nThe participant’s identity and attributes are fully secured and authenticated, i.e. no one can fake their identity and attributes. This is time for us to explain this cryptographic component in greater details.\n\nTo continue our explorations, we shall take the practical approach to demonstrate how to use Python cryptography to generate X.509 certificate with custom atributes; subsequently, we use charm-crypto framework’s hybrid adapter to perform CP-ABE (ciphertext-policy) with the X.509 custom attributes.\n\nPython Installation\n\nVirtual Environment\nUsing an isolated Python virtual environment will protect you from headaches and disaster of installations.\ncrypto (or your choice of name) is the name of the virtual environment, and python=3.5 is the Python version.\n\nconda create -n crypto python=3.5\n\n\nPress y to proceed. This will install the Python version and all the associated anaconda packaged libraries at `{path_to_anaconda_location}/envs/crypto\n\nThen activate the virtual environment by,\n\nsource activate crypto\n\n\ncryptography Module\nOur experiment starts with cryptography module,\ncryptography includes both high level recipes and low level interfaces to common cryptographic algorithms such as symmetric ciphers, message digests, and key derivation functions. We rely on many features from cryptography to illustrate the X509 certificate processing.\n\nMake sure the cryptography installation is inside the virtualenv.\n\npip install cryptography\n\n\nCharm Framework\nOur experiment is using the Python implementation of Attribute-based Encrpytion based on the charm-crypto framework [AGM13].\nCharm is designed for rapidly prototyping advanced cryptosystems. It was a well engineered framework that uses a hybrid design: performance intensive mathematical operations are implemented in native C modules, while cryptosystems themselves are written in a readable, high-level language. Charm additionally provides a number of new components to facilitate the rapid development of new schemes and protocols. 
That is, in essence, how we did the CP-ABE experiment.\n\nInstalling Charm from source is straightforward. First, verify that you have installed the following dependencies:\n\n  GMP 5.x\n  PBC\n  OPENSSL\n\n\nMake sure the Charm installation is done inside the virtualenv.\n\nNote: --enable-darwin is for MacOS installation\n\n./configure.sh --enable-darwin\nmake install\n\n\nTo validate that the installation is working,\nmake test\n\n\nPublic Key Infrastructure\nUnderstanding the Public Key Infrastructure (PKI) standard will help us design secure communication between the various network participants. PKI enables us to provide identity to the participants and to ensure that messages on the system are properly authenticated.\n\nThere are four key elements to PKI:\n\n  Public and Private Keys\n  Certificate Authorities\n  Digital Certificates\n  Certificate Revocation Lists (not discussed here)\n\n\n\ncredit: picture from Hyperledger Fabric documentation\n\nWhen obtaining a certificate from a Certificate Authority (CA), the usual flow is:\n\n\n  Generate a private/public key pair.\n  Create a request for a certificate (CSR), which is signed by your key (to prove that you own that key).\n  You give your CSR to a CA (but not the private key).\n  The CA validates that you own the resource (e.g. domain and/or attributes) you want a certificate for.\n  The CA gives you a certificate, signed by them, which identifies your public key, and the resource you are authenticated for.\n\n\nWe shall try to understand each of these components with experimental Python code in the fol..."
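The first two steps of that flow map directly onto the cryptography module. Below is a minimal sketch with invented subject names; on older versions of the library, the calls also take an explicit backend argument.

# Sketch: generate a key pair and a CSR (steps 1 and 2 of the flow above).
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)   # step 1

csr = (
    x509.CertificateSigningRequestBuilder()
    .subject_name(x509.Name([
        x509.NameAttribute(NameOID.COMMON_NAME, u"alice@example.org"),
        x509.NameAttribute(NameOID.ORGANIZATION_NAME, u"Example Clinic"),
    ]))
    .sign(key, hashes.SHA256())          # step 2: signed with our private key
)

print(csr.public_bytes(serialization.Encoding.PEM).decode())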
  },
  
  {
    "title": "Building StuG III Model in Winter Camouflage",
    "url": "/building-stug-iii-winter-camouflage-model",
    "date": "Sep 15, 2023",
    "categories": ["post"],
    "tags": ["Scale Miniatures","Diorama","Modelling","Process"],
    "excerpt": "\nAs we set forth on another exhilarating modeling journey, this time exploring into the realm of the World War 2 German StuG III anti-tank vehicle. This endeavor not only guarantees a captivating b...",
    "content": "\nAs we set forth on another exhilarating modeling journey, this time exploring into the realm of the World War 2 German StuG III anti-tank vehicle. This endeavor not only guarantees a captivating building experience but also offers an exploration of a pivotal fragment of history. I am looking forward to the delight and fulfillment that accompanies the creation of this scale model piece, eager to adopt the techniques showcased by fellow enthusiasts. As we reach the culmination of this project, I apply the power of Generative AI techniques to enhance the final photographic outcomes, adding a modern twist to this historical recreation.\n\n\n\n\nFigure. The completed World War 2 German StuG III tank (from Tamiya 1/35 StuG III Ausf.G model kit) adorned with a winter camouflage, complemented by various weathering elements, showcasing its final appearance.\n\nBuilding and Painting\n\nPutty Surfacing\n\nIn the reference tutorial, I stumbled upon a remarkable technique that has significantly elevated the realism and texture of my creations. This method involves utilizing putty (Tamiya Putty), a versatile material, to add a rich, tactile dimension to the otherwise smooth plastic surface of the model.\n\nBy skillfully manipulating the putty, I was able to mimic the rugged and coarse texture of metal, a transformation that not only enhances the visual appeal but also imparts a sense of weight and depth to the tank. This nuanced approach creates an illusion of a battle-hardened vehicle, adding layers of history and stories to its facade.\n\n\n\nFigure: Displayed in the top image is the added putty texture and the custom-built rack situated at the back, while the bottom image showcases the painted outcome, highlighting the rough putty texture that bestows the model with an illusion of greater depth and heft.\n\nFurthermore, I took a restrained approach to the weathering process, focusing on mastering the delicate art of chipping. This technique, when combined with the following sections, adds a level of authenticity that truly brings the model to life. It creates a narrative of a machine that has witnessed the ravages of war, bearing marks that speak of its journey and experiences.\n\nPainting and Weathering\n\nI find myself thoroughly enjoying the vibrant process of painting this StuG III Ausf.G, equipped with the formidable 105mm assault gun. This phase has been a playground of exploration, where I’ve been able to apply a range of new techniques gleaned from the expertise of various seasoned modellers. It’s a journey of continuous learning and experimentation, adding layers of depth and realism to the model.\n\nLayering the Base Coats\n\nThe process of building up the base layers is both methodical and artistic, involving a series of steps that each add a new dimension to the painted layers. 
Here’s a glimpse into the layers that form the canvas for the detailed artwork that follows:\n\n\n\n\n  Hull Red: This initial layer sets the tone, providing a deep and rich base that hints at the underlying metal structure of the tank.\n  Dark Sand: Following the red, a layer of dark sand is applied, adding a contrast and depth that starts to bring the texture and contours of the tank to life.\n  Chipping: At this stage, strategic chipping is introduced to create a worn, battle-hardened appearance, adding a realistic touch of age and wear.\n  Basic Shading: This layer involves the subtle art of shading, enhancing the three-dimensional aspect of the model and highlighting the intricate details of its design.\n  Camouflage Pattern: Next, a camouflage pattern is carefully applied, adding a tactical and aesthetic element that blends artistry with historical accuracy.\n  Chipping (Again): To finish off, a final round of chipping is applied, further accentuating the rugged and battle-worn persona of the tank, ready to tell its story through the ensuing layers of weathering and detailing.\n\n\nAs I progress through each layer, the model transforms, adopting a personality and a narrative that delight the viewers.\n\nOil Dots Surfacing\n\nAdopting a technique that involves the strategic application of diverse oil color dots, I have managed to infuse the tank’s surface with a rich tapestry of tones. By skillfully blending and blurring these hues, the surface transcends its initial state, morphing into a canvas that portrays metal’s weathering complexity, thereby enhancing depth and intriguing visual narrative.\n\n\n\nFigure: Showcasing the utilization of varied oil color dots applied to the tank’s surface, which, after blending and natural brush strokes, culminates in a finish that amplifies the realism of the tank’s surface hues.\n\nThe following photographs showcasing the completed base coat, a vital step that lays the groundwork for the subsequent painting and weathering phases.\n\n\n\nFigure. Illustrating the initial base coating of the tank, a layer that will largely be concealed by the subsequent application of the winter white wash.\n\nBringing the Tank Crews to Life\n\nThe journey of re..."
  },
  
  {
    "title": "Journey of Building Scale Model Dioramas",
    "url": "/journey-of-building-scale-model-dioramas",
    "date": "Jul 11, 2023",
    "categories": ["post"],
    "tags": ["Scale Miniatures","Diorama","Modelling","Process"],
    "excerpt": "\nDiving into the immersive world of modelling and miniatures, one finds the ability to freeze history in time, captured within a meticulously crafted diorama. A fascinating blend of history and cra...",
    "content": "\nDiving into the immersive world of modelling and miniatures, one finds the ability to freeze history in time, captured within a meticulously crafted diorama. A fascinating blend of history and craftsmanship await us in the creation of a scale model diorama. Though it may seem like a challenging task, a systematic approach can turn our imagined WWII scene into a tangible reality.\n\n\nLet us embark on this journey to create an evocative diorama of a Flakvierling 20mm anti-aircraft gun system mounted on an Opel Blitz. This scene, set against the backdrop of an Eastern Front stone barn house during WWII, tells a riveting tale of its own. The figures, a mixture from D-Day Miniatures and Evolution at 1/35 scale, are carefully painted in acrylic to breathe life into this historic moment. Follow this guide, step-by-step, to learn the way of building this piece of history.\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/journey-of-building-scale-model-dioramas/Podcast_Scale_Model_Diorama.mp3\n\n\n\n\n\n\n\n\nFigure. Here is the final presentation of our meticulous work - a scene set on the Eastern Front during World War II. The centrepiece is the rugged Opel Blitz truck equipped with the formidable Flakvierling 38 anti-aircraft guns. Completing the scene is a motorcyclist pausing in his journey, asking for directions. The details in the setting and the figures bring to life the struggles and the mundane moments experienced even amidst the chaos of war.\n\nThe Process\n\nCreating a scale model diorama begins with planning our scene, deciding on its scale and size, and selecting appropriate model kits, figures, and components. After conducting necessary research, we build the components and test their fit with a dry fitting. we then paint all elements according to our chosen theme before fitting them onto the base again. Following this, we design and paint the base itself. The final steps involve arranging all the components on the base, blending them into the environment, and fixing them into place.\n\n\n\nFigure. The presented diagram is a comprehensive visualization of our diorama building journey. From the initial sparks of imagination through to the tangible reality, this chart guides us through each stage of the creative process.\n\nAs depicted in the diagram, the diorama building process, much like software architecture, operates as a systematic and methodical workflow. Each step in the procedure is repeatable, allowing for consistent results while fostering creativity and innovation. The workflow, from the first spark of imagination to the final, tangible model, guides us on a journey of transformation, mirroring the systematic yet creative process integral to software architecture.\n\nPreparations\n\nPreparation is key when starting a new project. A diorama starts as an idea, a scene that we’d like to capture. This can come from our imagination, historical photos, or even a favourite movie scene.\n\nHistorical Background\n\nThe Flak 30 (Flugzeugabwehrkanone 30) and improved Flak 38 were 20 mm anti-aircraft guns used by various German forces throughout World War II. It was not only the primary German light anti-aircraft gun but by far the most numerously produced German artillery piece throughout the war. 
It was produced in a variety of models, notably the Flakvierling 38 which combined four Flak 38 autocannons onto a single carriage.\n\nThe term Vierling literally translates to “quadruplet” and refers to the four 20 mm autocannon constituting the design.\n\n\n\nThe Flakvierling four-autocannon anti-aircraft ordnance system, when not mounted into any self-propelled mount, was normally transported Sd. Ah. 52 trailer, and could be towed behind a variety of half-tracks or trucks, such as the Opel Blitz and the armoured Sd.Kfz. 251 and unarmored Sd.Kfz. 7/1 and Sd.Kfz. 11 artillery-towing half-track vehicles.\n\n\n\nScale, Vehicles and Equipments\n\nNext, determine the scale, size, and base of our diorama. The scale should be proportionate to the figures we plan to include and the size depends on the area we have available for the diorama.\n\nNow comes an exciting phase - the selection of model kits and figures. For this project, we’re specifically looking at 1/35 scale models to maintain a consistent level of detail and realism across the diorama. The world of model kits offers numerous choices that can cater to our specific requirements.\n\n\n\nFor the centerpiece of our diorama, we’re selecting the Opel Blitz German Truck Type S model kit made by Italeri. This will serve as the mobile platform for our anti-aircraft system.\n\nNext, we choose the German 20mm Flakvierling 38 anti-aircraft guns, a kit made by Tamiya. This model will add a potent defensive presence to our scene, and it’s the perfect fit for our Opel Blitz truck.\n\nKeep in mind that we’ll be modifying the Opel Blitz vehicle model to accommodate the Flak 38 guns, ..."
  },
  
  {
    "title": "Enhancing Biblical Study with ChatGPT",
    "url": "/enhancing-biblical-study-with-chatgpt",
    "date": "Mar 28, 2023",
    "categories": ["post"],
    "tags": ["Biblical Study","ChatGTP","AI"],
    "excerpt": "\nThe study of the Bible is an enriching and transformative journey, offering profound insights into our faith, personal growth, and understanding of the world. While traditional methods of biblical...",
    "content": "\nThe study of the Bible is an enriching and transformative journey, offering profound insights into our faith, personal growth, and understanding of the world. While traditional methods of biblical study, such as personal reading, attending classes, and consulting commentaries, remain essential, modern technology has opened up new possibilities for engaging with the sacred text. One such tool is ChatGPT, a powerful AI language model developed by OpenAI, capable of providing valuable assistance and insights for biblical studies.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/enhancing-biblical-study-with-chatgpt/Podcast_ChatGPT_Biblical_Study.mp3\n\n\n\n\n\n\n\n\nIn this article, we explore the various ways in which ChatGPT can enhance your biblical study experience, covering topics such as understanding context, translation comparisons, word studies, cross-references, interpretation assistance, theological concepts, biblical themes, historical background, exegesis, hermeneutics, commentaries, and practical application. While ChatGPT is not a replacement for traditional study methods or the guidance of trained theologians, it can serve as a helpful supplementary resource, providing immediate access to a wealth of information and insights.\n\nJoin us as we delve into the potential of ChatGPT as a tool for biblical study, and discover how this powerful AI language model can support your journey towards a deeper understanding of the Bible, enriching your faith and spiritual growth.\n\n\n  All examples in this article are tested and generated by the latest OpenAI GPT-4, which is available via ChatGPT Plus Subscription.\nGPT-4 is not a requirement to utilize ChatGPT effectiveness in Biblical Study, regular ChatGPT is able to generate equally effective (but shorter) response.\n\n  The full answers are not included in this article due to the length. If reader is interested to read how ChatGPT answers to the example questions, please leave a comment and we shall response to you.\n\n\n\n\n\nFigure. Illustrated selecting the GPT-4 used by ChatGPT.\n\nUnderstand How to Interact with AI\n\nWe start with the understanding of how to interact with AI effectively. By learning how to communicate your needs and inquiries clearly, you can get the most out of your AI-assisted biblical study experience. We will explore tips and strategies for asking questions, providing context, and engaging in meaningful conversations with AI to enhance your understanding of biblical concepts and teachings.\n\nAI language models like ChatGPT act as mirrors in many ways, reflecting the user’s interests and knowledge on a given topic. Let’s explore these aspects in more detail:\n\n\n  \n    Your interest/preference: When engaging with an AI language model, it responds to your specific questions and inquiries, tailoring its responses based on the subjects and themes you express interest in. By doing so, it helps cater to your personal preferences, providing you with information and insights that are most relevant to you.\n  \n  \n    Your knowledge of the topic: AI language models can adapt their responses to the level of knowledge or expertise you display. By analyzing the questions and vocabulary you use, ChatGPT can provide more general or detailed responses, depending on your current understanding of the subject. 
This adaptability makes AI language models useful tools for users with varying levels of expertise in a particular area, such as biblical studies or any other field of interest.\n  \n\n\nContext, Context, Context\n\nAI generates an answer based on context for a given question. While traditional search engines like Google provide links to sources that might contain the information you’re seeking, AI language models aim to generate a personalized and summarized response based on the context provided in your question.\n\n\n\nThe context provided by the user plays a crucial role in determining the accuracy and relevance of the AI-generated response. Without sufficient context or clarification, ChatGPT might misinterpret the question and provide an answer that doesn’t align with the user’s intended meaning.\n\nFor examples,\n\n  If the user’s question is related to mathematics and includes terms like “sin,” “cosine,” or other mathematical concepts, ChatGPT will recognize the context and generate an answer related to trigonometry.\n  If the user’s question is related to Christianity and the term “sin” is used in a religious context, ChatGPT will generate an answer related to the biblical understanding of sin.\n\n\nUnless the user clarifies context, we can see that AI may misinterpret the question.\n\nBegin with Clear Context\n\nStarting your inquiry with a clear context is a powerful approach to ensure an AI language model understands your intent. By specifying a book or resource in your question, you can provide immediate context that helps generate more accur..."
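For readers comfortable with a little Python, the same “context first” habit can be made explicit when calling the model programmatically. The sketch below uses the openai package as it existed when this article was written (the interface has since changed), and the prompt is purely illustrative.

# Sketch: pin the study context in a system message before asking the question.
import openai

openai.api_key = "sk-..."   # your own API key

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "You are a biblical study assistant. Answer with reference to "
                    "the Book of Romans, citing chapter and verse."},
        {"role": "user", "content": "How does Paul use the word 'sin'?"},
    ],
)
print(response["choices"][0]["message"]["content"])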
  },
  
  {
    "title": "Spatial Reasoning in AGI - Insights from Philosophical Perspectives",
    "url": "/spatial-reasoning-in-agi",
    "date": "Mar 26, 2023",
    "categories": ["post"],
    "tags": ["AI","AGI","Philosophy","Spatial Reasoning"],
    "excerpt": "\nSpatial understanding is indeed an important aspect of achieving Artificial General Intelligence (AGI), which refers to machines possessing human-level intelligence across a wide range of tasks an...",
    "content": "\nSpatial understanding is indeed an important aspect of achieving Artificial General Intelligence (AGI), which refers to machines possessing human-level intelligence across a wide range of tasks and domains. Despite the belief that advanced Large Language Models (LLMs), such as GPT-4, demonstrate some AGI capabilities, these models may encounter difficulties when explaining concepts requiring spatial reasoning skills.\n\n\nLLMs excel at processing and generating text on a vast scale; however, they can struggle to convey ideas that involve understanding and manipulating visual and spatial relationships between objects in a three-dimensional space. This limitation arises because spatial reasoning skills are not easily communicated through language alone.\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/spatial-reasoning-in-agi/Podcast_Spatial_Reasoning_AGI.mp3\n\n\n\n\n\n\n\n\nFigure. An illustration inspired by the artistic styles of M.C. Escher, Salvador Dalí, and Wassily Kandinsky that represents the concept of AGI (Artificial General Intelligence) requiring spatial reasoning and cognitive abilities. (image credit: Stable Diffusion).\n\nIn this article, we express our dissatisfaction with existing AI systems’ limitations in AGI and propose a broader perspective that incorporates philosophical ideas. By examining the works of philosophers who have studied human cognition and spatial reasoning, we aim to explore new research directions that may bring us closer to the AGI goal. We argue that understanding the underlying principles of human intelligence and incorporating these insights into AI systems could lead to models with improved versatility, adaptability, and spatial reasoning capabilities, ultimately advancing the pursuit of AGI.\n\nWhy Spatial Reasoning?\n\nSpatial reasoning is a subfield of artificial intelligence that enables a computer to comprehend its surroundings based on its position. This involves identifying objects within the environment and then skillfully manipulating them in a practical manner. Applications of spatial reasoning include navigation, object manipulation, and environmental interpretation. It’s currently employed in areas such as GIS, robotics and gaming.\n\nTo understand personally, imaging a simple puzzle (in the illustration) that requires spatial reasoning skills might be easily solvable by us. We are able to answer the questions of (1) is there a solution to fit all the loose pieces in the space? and (2) how can we pack all the pieces in the space optimally?\n\nWithout a human-like innate ability to comprehend spatial relationship, the puzzle poses a significant challenge for machines. This type of puzzle could involve manipulating objects, visualizing rotations or transformations, or navigating through a complex environment. While humans can often intuitively grasp these concepts, machines may struggle to find an effective approach to tackle these problems without specifically designed representation, algorithms or methods that can handle spatial information.\n\n\n\nFigure. Illustrated an example spatial reasoning domain. Questions to be answered (1) is there a solution? 
(2) how can we pack the space?\n\nFrom an engineering perspective, developing spatial reasoning algorithms presents several challenges, such as managing high-dimensional data, dealing with noisy sensor inputs, and achieving real-time performance. Spatial reasoning research is required to devise more reliable and efficient representations and algorithms that are suitable for use by machines.\n\nThe development of AI systems with enhanced spatial reasoning capabilities is important for comprehending their surroundings and making informed predictions about future outcomes. However, before diving into the technical challenges associated with the engineering of such a spatial reasoning system, it is essential to explore the philosophical insights that can inform its design. In the next section, we will investigate the contributions of various philosophers, whose work illuminates the nature of human spatial reasoning and cognition. By standing on the philosophical foundations of spatial reasoning, we can more effectively identify the key considerations that should guide the development of advanced AGI systems.\n\nPhilosophy and Spatial Reasoning\n\nIt’s essential to understand our cognitive abilities, which allow us to perceive, process, and act upon spatial information. Although philosophy may not directly involve engineering, its transformative ideas can influence the development principles of AGI systems that exhibit human-like spatial reasoning capabilities.\n\n\n\nFigure. Inspired by M.C. Escher, Salvador Dalí, and Wassily Kandinsky’s styles, like fragmented landscapes, impossible structures, and dreamlike scenes (image credit: Stable Diffusion).\n\nPhilosophers’ Thoughts on Spatial Cognition &amp; Reasoning\n\nHer..."
  },
  
  {
    "title": "Ask a Book Questions with LangChain and OpenAI",
    "url": "/ask-a-book-questions-with-langchain-openai",
    "date": "Mar 12, 2023",
    "categories": ["post"],
    "tags": ["AI","OpenAI","LangChain","NLP","Python"],
    "excerpt": "\nReading a book can be a fulfilling experience, transporting you to new worlds, introducing you to new characters, and exposing you to new concepts and ideas. However, once you’ve finished reading,...",
    "content": "\nReading a book can be a fulfilling experience, transporting you to new worlds, introducing you to new characters, and exposing you to new concepts and ideas. However, once you’ve finished reading, you might find yourself with a lot of questions that you’d like to discuss. Perhaps you don’t have anyone nearby who has read the book or is interested in discussing it, or maybe you simply want to explore the book on your own terms. In this situation, you might be left wondering how long it will take to fully digest the book and answer your own questions. Without a tutor or friends around to provide guidance and discussion, you may need to take a more thoughtful and introspective approach to your reading.\n\n\nMortimer Adler famously advised in his classic book “How to Read a Book”,\n\n\n  “Reading a book should be a conversation between you and the author.”\n\n\n\n\nFigure. Imagine that we are having a non-judgemental AI tutor to assist in the question and answer to a book. (credit: artwork by Stable Diffusion)\n\nImagine that we are having a non-judgmental AI tutor to assist in the question and answer process can be incredibly helpful, especially when it comes to exploring and applying the ideas presented in a book. An AI can provide unbiased and objective insights into the book’s themes and concepts, and help you to understand the author’s perspective on the subject matter. With an AI’s assistance, you can ask deeper and more meaningful questions, and receive thoughtful and informative responses that can help you to connect the ideas in the book to your own experiences and beliefs. This can lead to a more enriching before and after the reading experience.\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/ask-a-book-questions-with-langchain-openai/Podcast_Ask_a_Book_Questions.mp3\n\n\n\n\n\n\nHow to Build a AI Question and Answering System?\nIn this article, we take the practical approach of building a question and answering system. In the process, we explain how to perform semantic search and query on a book using OpenAI, LangChain, and Pinecone - an external vector store. The book is broken down into smaller documents, and OpenAI embeddings are used to convert them into vectors, which are then stored externally using Pinecone.\n\n\n\nFigure. In this article, we shall walkthrough the process of (1) Extract the Book Content, (2) Split Book into Smaller Chunks, (3) Build Semantic Index and (4) Ask a Book Questions (the red arrows show the questioning flow and the green arrows show the answering flow).\n\nSelected the Book\nWe are using an interesting and free online book: 60 Leaders on Artificial Intelligence, to illustrate the whole process. This is a book in PDF format and contains 236 pages including plenty of graphics. If we can automatically extract the unstructed text and build an index, subsequently to query and summarize from the content.\n\n\n\nFigure. Using “60 Learders on Artificial Intelligence” for the implementation.\n\nThe example demonstrates how to ask a question in natural language and receive an answer using this technique. This approach is not limited to books and can be used for internal documents or external data sets as well. 
By following the steps outlined, readers will be able to conduct sophisticated searches on large volumes of text, which can assist in answering the questions that we might have after reading the book.\n\nInstallation\n\nUsing our philosophy of learning by doing, we shall take the practical approach to demonstrate how to install all the required Python modules to build a system. The latest LangChain, which has all the goodies of handling many unstructured document formats including PDF and Microsoft Word, requires Python &gt;= 3.8.1. First, we are going to create a virtualenv nlp with python==3.9.\n\nconda create -n nlp python==3.9\nconda activate nlp\n\n\nAfter we activate the nlp virtualenv, we can install LangChain with the “all” modules needed for all integrations:\npip install -U langchain[all]\n\n\nWe also want to add OpenAI and Pinecone support,\npip install openai\npip install pinecone-client\n\n\nRunning on Mac Platform Requirements\nWe are using a Mac M1 Pro to run the experiment. Additional brew packages are required. (Obviously, we cannot provide instructions on how to install on Windows.)\n\n\n  \n    poppler is a free software utility library for rendering Portable Document Format (PDF) documents.\n    tesseract is an optical character recognition (OCR) tool for Python. That is, it will recognize and “read” the text embedded in images.\n  \n\n\n# Install other dependencies\n# https://github.com/Unstructured-IO/unstructured/blob/main/docs/source/installing.rst\nbrew install libmagic\nbrew install poppler\nbrew install tesseract\n# If parsing xml / html documents:\nbrew install libxml2\nbrew install libxslt\n\n\nUnstructured File Loader\nThe LangChain Unstructured covers how to load fi..."
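Once the installation is in place, the whole pipeline described above fits in a handful of lines. The sketch below uses the LangChain, OpenAI and Pinecone interfaces as they existed in early 2023 (all three have changed since), and the file name, index name and question are only placeholders.

# Sketch of the load -> split -> embed -> ask pipeline (early-2023 APIs).
import pinecone
from langchain.document_loaders import UnstructuredPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

pinecone.init(api_key="...", environment="us-west1-gcp")

# (1) Extract the book content
docs = UnstructuredPDFLoader("60_leaders_on_ai.pdf").load()

# (2) Split the book into smaller chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# (3) Build the semantic index in Pinecone
index = Pinecone.from_documents(chunks, OpenAIEmbeddings(), index_name="book-index")

# (4) Ask a book question
question = "What do the authors see as the biggest risks of AI adoption?"
relevant = index.similarity_search(question, k=4)
chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff")
print(chain.run(input_documents=relevant, question=question))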
  },
  
  {
    "title": "Create Personal Animated AI Avatar",
    "url": "/create-personal-animated-ai-avatar",
    "date": "Feb 22, 2023",
    "categories": ["post"],
    "tags": ["Stable Diffusion","AI Creativity","AI Avatar"],
    "excerpt": "\nCreating your personal AI avatar can be a fun and exciting way to explore the capabilities of cutting-edge AI technologies. An AI avatar is a digital representation of a person that is created usi...",
    "content": "\nCreating your personal AI avatar can be a fun and exciting way to explore the capabilities of cutting-edge AI technologies. An AI avatar is a digital representation of a person that is created using artificial intelligence (AI) tools and techniques. It can take many forms, such as a 3D model, an animated character, or a virtual assistant. AI avatars are becoming increasingly popular in areas such as gaming, entertainment, virtual assistants, and social media.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/create-personal-animated-ai-avatar/Podcast_Personal_AI_Avatar.mp3\n\n\n\n\n\n\nThe potential reasons why we might want to create an AI avatar include:\n\n\n  Gaming: We could create an avatar to represent user in a video game, giving user a more personalized gaming experience.\n  Social media: An AI avatar can serve as a unique and eye-catching profile picture on social media platforms.\n  Virtual assistants: We could create an avatar to serve as a virtual assistant, which could respond to voice commands and help user with tasks such as scheduling appointments or customer supports.\n\n\nIn this article, we will walk through the process of creating a personal AI avatar using bleeding edge AI tools, such as Stable Diffusion, ChatGPT, ElevenLabs, and D-ID. Subsequently, we also looked at the data models and class structures required to create an animated chatbot system, and how user input is processed and classified to generate appropriate responses.\n\n\nFigure. Illustrated the animated AI avatar creation workflow. (1) Generating avatar picture using Stable Diffusion (2) Generating text using ChatGPT (3) Generating a voice using ElevenLabs (4) Combining your avatar picture and generated voice using D-ID.\n\n\n  For example, See my personal AI avatar in action: Animated Personal AI Avatar Demo\n\n\n\n  \n\n\nFigure. Demo of an animated personal AI avatar.\n\nTo learn how to train a personal model for Stable Diffusion, read our previous posts on the topics:\n\n  Stable Diffusion Training for Personal Embedding\n  Dreambooth Training for Personal Embedding\n\n\nStep 1: Generating an avatar picture using Stable Diffusion\n\nStable Diffusion is a state-of-the-art AI tool that can be used to generate high-quality images of faces. To use Stable Diffusion, you need to provide it with some starting parameters, such as gender, age, and hairstyle, and it will generate a realistic image of a face that matches those parameters.\n\nTo generate your personal avatar picture, start by building a personal embedding described in our previous post on Dreambooth Training for Personal Embedding. You can choose any combination of parameters that you like, depending on the kind of avatar that you want to create. 
Once you have chosen your parameters, run Stable Diffusion to generate your avatar picture.\n\nPrompt: bennycheung person is the token of our personal embedding model\nbennycheung person as a fantasy character, ultra realistic, intricate, elegant,\nhighly detailed, 8k, digital painting, detailed background, trending on artstation,\nsmooth, sharp focus, illustration, in the style of wlop, greg rutkowski\n\n\nNegative Prompt: that’s right, we need to tell Stable Diffusion a lot of things NOT to do!\n(((wrinkle))), bread, hat, disfigured, kitsch, ugly, oversaturated, grain,\nlow-res, Deformed, burry, bad anatomy, disfigured, poorly drawn face, mutation,\nmutated, extra limb, ugly, poorly drawn hands, missing limb, blurry, floating limbs,\ndisconnected limbs, malformed hands, bur, out of focus, long neck, long body, ugly,\ndisgusting, poorly drawn, childish, mutilated, mangled, old, surreal, text, blurry,\nb&amp;w, monochrome, conjoined twins, multiple heads, extra legs, extra arms,\nfashion photos (collage:1.25), meme, deformed, elongated, twisted, fingers,\nstrabismus, heterochromia, closed eyes, blurred, watermark, wedding, group\n\n\n\n\nFigure. Use Stable Diffusion (via AUTOMATIC1111 Web UI) to generate the avatar image from a prompt.\n\nStep 2: Generating text using ChatGPT\n\nChatGPT is a powerful natural language processing (NLP) tool that can be used to generate text based on a given prompt. To use ChatGPT, you need to provide it with a prompt, and it will generate a piece of text that follows from that prompt.\n\nTo generate the script for your AI avatar, start by choosing a prompt that will help you to create the kind of dialogue that you want your avatar to have. For example, you might choose a prompt like “Introduce yourself” or “Tell me about your interests.” Once you have chosen your prompt, run ChatGPT to generate the script.\n\n\n\nFigure. Use ChatGPT to write a script for the avatar speech.\n\nStep 3: Generating a voice using ElevenLabs\n\nElevenLabs is a cutting-edge AI tool that can be used to generate realistic voices based on a given text. To use ElevenLabs, we need to provide it with a text script, and it will generate a voice that speaks the text ..."
  },
  
  {
    "title": "Dreambooth Training for Personal Embedding",
    "url": "/dreambooth-training-for-personal-embedding",
    "date": "Nov 11, 2022",
    "categories": ["post"],
    "tags": ["Dreambooth","Stable Diffusion","AI Creativity","Python"],
    "excerpt": "\nThis article will focus on training an embedding that is deeper and is able to go farther than the original Stable Diffusion software that we described in the previous post - Stable Diffusion Trai...",
    "content": "\nThis article will focus on training an embedding that is deeper and is able to go farther than the original Stable Diffusion software that we described in the previous post - Stable Diffusion Training for Personal Embedding, which already has a solid embedding to use to generate art. We want to train it to be even more customized, in a way that goes through larger and more complex contexts.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/dreambooth-training-for-personal-embedding/Podcast_Dreambooth_Training_Personal.mp3\n\n\n\n\n\n\nTraining this more generalized checkpoint model will be via the use of the Dreambooth’s hypernetwork training technique. This will combine the personal image and embedding inside of it, which leads to new artwork that will be much higher quality and accurate than previous art that is just created from a basic Stable Diffusion embedding model. This is the next step and will improve the fidelity and accuracy of this art and the AI’s understanding of the given prompts.\n\n\n\nFigure. All of the creative imaginations of personal embedding into the toy and materials! From top-left, action figure, potato head, plush doll, simpson, play-doh, lego minifigure, garfield cat, disney, low-poly, funko-pop, pixar, plastic, video game, dog, yarn doll, lego blocks.\n\nAfter learning how to use Dreambooth training a personal embedding, we shall also explore the topics on AI creativity and learn how to improve the generated images by img2img’s inpainting.\n\nUsing Dreambooth\nDreambooth and Stable Diffusion are capable of producing great works of art. The key differences are that Dreambooth is more targeted toward users who want to create images that look like a specific person, whereas Stable Diffusion is a more general image generation.\n\n  With Stable Diffusion, the artist creates a prompt and then runs that prompt through the AI system to see what images are generated. This system can produce more realistic images but it might take longer to generate something that the user liked.\n  With Dreambooth, the artist trains an AI system to create a particular artist’s or art style. The images produced are more predictable and quicker to generate.\n\n\nDreambooth is able to generate more precise results but can only generate specific individuals. Stable Diffusion can generate a variety of images, but the results may be less precise. For general use, Stable Diffusion is a better choice - although for precise individual use, Dreambooth is a superior choice.\n\n\n\nFigure. By input a set of personal images, Dreambooth training will produce a Stable Diffusion hypernetwork model that can be used in various context. See the original paper by Nataniel Ruiz, et. al., DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation\n\nFollowing “Aitrepreneur” video on YouTube, detailing how to use Dreambooth training a custom Stable Diffusion model on RunPod.io or Google’s Colab environment, has explained the process step by step, and even includes the necessary code references in his video description. We see the trained custom model performing so well that motivate us to try out the technique ourselves!\n\nHow to Find the Training Hardware?\nOne of our biggest obstacles in training with the Dreambooth algorithm is the hardware requirement. This algorithm requires a GPU with a minimum of 24GB of VRAM. 
While the NVidia RTX 3090 is a great option, if you can get your hands on one, the price for a new one is a whopping $1,500 CAD. We are looking for a cheaper alternative, such as a cloud service that can offer that level of VRAM at a lower cost.\n\nAfter searching for suitable servers with the needed hardware, we resolved to use RunPod.io. It offers the option to rent a secure cloud-based machine which includes an RTX A5000 with 24GB of VRAM. This rental costs only $0.49 USD an hour, much more affordable than buying the NVidia RTX 3090!\n\n\n\n\n  **Important Note!!! When starting the Pod, make sure more disk space is allocated. In our case, change from 20 to 40 GB for both container disk and volume disk spaces; otherwise, we may risk running out of disk space.\n\n\n\n\nWhich Dreambooth-Stable-Diffusion?\nThere are hundreds of forks of Google’s original Dreambooth-Stable-Diffusion. However, the “Aitrepreneur” video points out that the Joe Penna branch of Dreambooth-Stable-Diffusion contains special jupyter notebooks designed to help train your personal embedding. It has the notebook designed to run on Google Colab or RunPod.io. This allows us to train a hypernetwork model that will work with Stable Diffusion. The goal is to create a checkpoint model that can be used to generate personal profile images based on text prompts.\n\nTraining Process\nNow for the meat of the training details: training a hypernetwork is the process of generating a personalized embedding for the Stable Diffusion software; ..."
  },
  
  {
    "title": "Stable Diffusion Training for Personal Embedding",
    "url": "/stable-diffusion-training-for-embeddings",
    "date": "Nov 02, 2022",
    "categories": ["post"],
    "tags": ["AI","Stable Diffusion","Textural Inversion","Python"],
    "excerpt": "\nWe previously described the Neural Style Transfer and Deep Dream, which were among the first popular application of the AI technology on artistic works 5 years ago, but quickly made way for a more...",
    "content": "\nWe previously described the Neural Style Transfer and Deep Dream, which were among the first popular application of the AI technology on artistic works 5 years ago, but quickly made way for a more powerful and capable model named Textual Inversion. Stable Diffusion is a free tool using textual inversion technique for creating artwork using AI. The tool provides users with access to a large library of art generated by an AI model that was trained the huge set of images from ImageNet and the LAION dataset. The resulting Diffusion Models are able to create stunningly realistic images in a variety of mediums, including digital paintings, photo-based art, comics, and even animations. The tool is incredibly easy to use and can create images in seconds. We can see in Lexica site many example prompts that generating beautiful artworks by Stable Diffusion.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/stable-diffusion-training-for-embeddings/Podcast_Stable_Diffusion_Personal.mp3\n\n\n\n\n\n\nAfter successfully training a personal embedding, Stable Diffusion’s AI creativity is nearly infinite. The algorithms that power Stable Diffusion are two neural networks that work in tandem. They are called an autoencoder and a generative adversarial network, which are a type of neural network that are designed to learn patterns in data. The architecture is fed billions of image and text prompts over thousands of computer hours to teach it how to process language and understand images. After training, it can associate certain words or phrases with certain visual elements. For example, it could learn when you type the word “forest”, it should generate an image containing trees and leaves.\n\nThe sample prompt to generate most of the following portraits is\n\"[realbenny-t2 | panda] happy front portrait as pixar, disney character\n illustration by artgerm, tooth wu, studio ghibli, sharp focus, artstation\".\n\n\n\n\nFigure. Examples of Stable Diffusion AI generated portraits using the trained personal embedding with the given input prompt. We have many controls in Stable Diffusion to instruct the direction of the AI creativity.\n\nIf you are interested in learning how to use Stable Diffusion to generate personal profile images from text prompts, after reading this article, you will be able to train a personal embeddings model for Stability Diffusion AI!\n\nInstalling Stable Diffusion\nThe official Stable Diffusion repository named AUTOMATIC1111 provides step by step instructions for installing on Linux, Windows, and Mac. We won’t go through those here, but we will leave some tips if you decide to install on a Mac with an M1 Pro chip. If you are not using M1 Pro, you can safely skip this section.\n\nInstallation on Mac M1 Pro\n\n  If Xcode is not fully installed. Run this to complete the install:\n  xcodebuild -runFirstLaunch\n  \n\n\nWhile the web UI runs fine, there are still certain issues when running this fork on Apple Silicon.  
All samplers seem to be working now except for “DPM fast” (which returns random noise), and DDIM and PLMS (both of which fail immediately with the following error: “AssertionError: Torch not compiled with CUDA enabled”).\n\nInstall Homebrew\nFirst, you need to install the required dependencies using Homebrew.\n\nxcode-select --install\n\n/usr/bin/ruby -e \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)\"\necho 'eval \"$(/opt/homebrew/bin/brew shellenv)\"' &gt;&gt; /Users/benny.cheung/.bash_profile\neval \"$(/opt/homebrew/bin/brew shellenv)\"\nbrew -v\n\n\nInstall Rosetta 2\nThe magic behind translating Intel 64-bit code to arm64 automatically!\n\nsudo softwareupdate --install-rosetta --agree-to-license\n\n\nInstall Stable Diffusion Requirements\nInstall the build requirements with Homebrew.\n\nbrew install cmake protobuf rust python git wget\n\n\nThe setup script can be downloaded from here, or follow the instructions below.\n\n\n  Open Terminal.app\n  Run the following commands:\n\n\n$ cd ~/Documents/\n$ curl https://raw.githubusercontent.com/dylancl/stable-diffusion-webui-mps/master/setup_mac.sh -o setup_mac.sh\n$ chmod +x setup_mac.sh\n$ ./setup_mac.sh\n\n\nAfter installation, you’ll find run_webui_mac.sh in the stable-diffusion-webui directory. Run this script to start the web UI using ./run_webui_mac.sh. This script automatically activates the conda environment, pulls the latest changes from the repository, and starts the web UI. On exit, the conda environment is deactivated.\n\nPost-installation notes for brew:\n\nAfter the installation, we don’t want PostgreSQL and Redis to start automatically whenever the Mac reboots, so do this to stop the services\n\nbrew services stop postgresql@14\nbrew services stop redis\n\n\nHow Stable Diffusion Works\n\nStable Diffusion is a text-to-image generation model where you can enter a text prompt like,\nhalf (realbenny-t1 | yoda) person, star war,\nart by artgerm and g..."
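As a rough companion to the web-UI workflow in the excerpt above, the following minimal sketch loads a trained textual-inversion embedding and generates a portrait with the Hugging Face diffusers library. This is an assumption for illustration only: the article itself drives the AUTOMATIC1111 web UI, and the embedding file path and token below are hypothetical placeholders.

from diffusers import StableDiffusionPipeline

# Load a Stable Diffusion v1.5 checkpoint (any SD 1.x checkpoint should work).
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("mps")  # "mps" on Apple Silicon; use "cuda" or "cpu" elsewhere

# Bind the trained personal embedding to its prompt token (hypothetical file name and token).
pipe.load_textual_inversion("./embeddings/realbenny-t2.pt", token="realbenny-t2")

prompt = ("realbenny-t2 happy front portrait as pixar, disney character "
          "illustration by artgerm, tooth wu, studio ghibli, sharp focus, artstation")
image = pipe(prompt).images[0]
image.save("portrait.png")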
  },
  
  {
    "title": "FireSQL in Python",
    "url": "/firesql-in-python",
    "date": "Apr 29, 2022",
    "categories": ["post"],
    "tags": ["Python","Parsing","Firestore","SQL","FireSQL"],
    "excerpt": "\nPyFireSQL is a SQL-like programming interface to query Cloud Firestore collections using Python. Cloud Firestore is a NoSQL, document-oriented database. Unlike a SQL database, there are no tables ...",
    "content": "\nPyFireSQL is a SQL-like programming interface to query Cloud Firestore collections using Python. Cloud Firestore is a NoSQL, document-oriented database. Unlike a SQL database, there are no tables or rows. Instead, you store data in documents, which are organized into collections.\n\n\nThere is no formal query language to Cloud Firestore - NoSQL collection/document structure. For many instances, we need to use the useful but clunky Firestore UI to navigate, scroll and filter through the endless records. With the UI, we have no way to extract the found documents. Even though we attempted to extract and update by writing a unique program for the specific task, we felt many scripts are almost the same that something must be done to limit the endless program writing. What if we can use SQL-like statements to perform the data extraction, which is both formal and reusable? - This idea will be the motivation for the FireSQL language!\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/firesql-in-python/Podcast_FireSQL_In_Python.mp3\n\n\n\n\n\n\n\n\nEven though we see no relational data model of (table, row, column), we can easily see the equivalent between table -&gt; collection,  row -&gt; document and column -&gt; field in the Firestore data model. The SQL-like statement can be transformed accordingly.\n\nThere is a similar project FireSQL, which is written in Typescript and pegis parser generator to use SQL-like language to interface Firestore. We felt strongly that the combination of Python and lark parser generator is more appropriate for backend processing and analysis toolchain. In particular, the extracted data can be directly imported into downstream Pandas’s Dataframe post-processing will be an extremely valuable prospect.\n\nFireSQL Statements\nThe set of implemented SQL-like DML (Data Manipulation Language) statements are,\n\n\n  \n    \n      FireSQL Statement\n      Description\n    \n  \n  \n    \n      SELECT\n      select documents from a collection\n    \n    \n      INSERT\n      insert new document in a collection\n    \n    \n      UPDATE\n      modify the existing documents in a collection\n    \n    \n      DELETE\n      delete existing documents in a collection\n    \n  \n\n\nIn this article, we shall focus on the FireSQL’s SELECT statement. For the interested reader to know the details,\nplease read in the corresponding PyFireSQL Documentation @readthedocs.\n\nFireSQL Parser Explained\nThe FireSQL parser consists of two parts: the lexical scanner and the grammar rule module. Python parser generator Lark is used to provide the lexical scanner and grammar rule to parse the FireSQL statement. In the end, the parser execution generates the parse tree, aka. AST (Abstract Syntax Tree). The complexity of the FireSQL syntax requires an equally complex structure that efficiently stores the information needed for executing every possible FireSQL statement.\n\nFor example, the AST parse tree for the FireSQL statement\nSELECT id, date, email\n  FROM Bookings\n  WHERE date = '2022-04-04T00:00:00'\n\n\n\n\nFigure. Illustration of the parse tree generated by lark\n\nThis is delightful to use lark due to its design philosophy, which clearly separate the grammar specification from processing. 
The processing is applied to the parse tree by the Visitor or Transformer components.\n\nVisitor and Transformer\nVisitors and Transformers provide a convenient interface to process the parse trees that Lark returns. The Lark documentation defines,\n\n\n  Visitors - visit each node of the tree, and run the appropriate method on it according to the node’s data. They work bottom-up, starting with the leaves and ending at the root of the tree.\n  Transformers - work bottom-up (or depth-first), starting with visiting the leaves and working their way up until ending at the root of the tree.\n    \n      For each node visited, the transformer will call the appropriate method (callbacks), according to the node’s data, and use the returned value to replace the node, thereby creating a new tree structure.\n      Transformers can be used to implement map &amp; reduce patterns. Because nodes are reduced from leaf to root, at any point the callbacks may assume the children have already been transformed.\n    \n  \n\n\n\n  Using a Visitor is simple at first, but you need to know exactly what you’re fetching; the children chain can be difficult to navigate depending on the grammar that produced the parse tree.\n\n\nWe decided to use a Transformer to transform the parse tree into the corresponding SQL component objects that can be easily consumed by the subsequent processing.\n\nFor instance, the earlier example parse tree is transformed into SQL components as,\n\nSQL_Select(\n  columns=[SQL_ColumnRef(table=None, column='id'),\n           SQL_ColumnRef(table=None, column='date'),\n           SQL_ColumnRef(table=None, column='email')],\n  froms=[SQL_SelectFrom(part='Bookings', alias=None)],\n  where=S..."
  },
  
  {
    "title": "Game Architecture for Card Game AI (Part 3)",
    "url": "/game-architecture-card-ai-3",
    "date": "Jul 03, 2021",
    "categories": ["post"],
    "tags": ["Game Architecture","Card Game","Python","AI"],
    "excerpt": "\nThe last article on the topics of “Game Architecture for Card Game” series will focus on the amazing “Race for the Galaxy” AI. Even though Keldon Jones released his RFTG AI source code back in 200...",
    "content": "\nThe last article on the topics of “Game Architecture for Card Game” series will focus on the amazing “Race for the Galaxy” AI. Even though Keldon Jones released his RFTG AI source code back in 2009 [Jones09], it was using neural networks and reinforcement learning to train the game AI, way before DeepMind’s Alpha Go success that drew the world’s attention to reinforcement learning.\n\n\nAt the heart of the card game architecture is the AI that keeps the human player engaged and provides the endless replayability of a game. We must provide the AI agent (either a single or a group of players) the ability to make intelligent decisions according to the game rules and logic. The AI action will be the movement that affects the future game states. The reward of the decision is simple, “winning” the target game.\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/game-architecture-card-ai-3/Podcast_Card_Game_AI_Part_3.mp3\n\n\n\n\n\n\n\nFigure. Game Architecture Overview - The components are grouped according to their functional roles in the system. The functional roles are (1) Game Story and Game Asset, (2) Game Model, (3) Game Engine, (4) Game Interface, (5) Game AI, (6) Game Physics (only for physics based game), and (7) Hardware Abstraction.  When studying any game source code, this architecture will help to classify their functional roles\n\nTo train a Reinforcement Learning (RL) agent how to play a card game intelligently,\na full-fledged game environment has been put in place and needs to capture all the mechanics and rules so that the agent can interact with the game like a real human player would do. The hardest part is to develop the game that can play intelligently against itself. With that thought, researches using RL in card games are gaining a lot of popularity, such as the card games of Easy21 [Amrouche20], UNO [Pfann21]. The RLCard Toolkit [Daochen19], goes even one step further by providing a general RL framework for researching card games. In this article, we shall continue focus on “Race for the Galaxy” card game.\n\nHowever, the RFTG card game’s rule complexity is difficult to express in a well-defined game state representation, unlike Easy21 or UNO simplicity. We must raise the level of abstraction in order to describe RFTG AI. We are hoping to guide readers in the right direction with the following outline.\n\nRFTG Python Development\nInterest reader can find the full development set up instruction, Python source code and Jupyter notebook experiments described in this article from [Cheung21].\n\nJupyter Notebook Experiments (Part 3)\nThe development experiments on (Part 3) are recorded in the Jupyter Notebook rftg_ai.ipynb to quickly run the code samples.\nInside Visual Studio code, install the Microsoft’s “Jupyter” extension. 
When activating rftg_ai.ipynb inside VS Code, change the Python kernel to use the rftg environment that has been set up from the code README.md instructions.\n\n\n\nInstallation for AI Notebook\nThe following Python modules are required to run this AI notebook rftg_ai.ipynb.\n\npip install keras==2.4.3\npip install tensorflow==2.5.0\npip install pydot\n\n\nIn addition, we need to define the .keras/keras.json to use the tensorflow backend.\n\n {\n  \"image_data_format\": \"channels_last\",\n  \"epsilon\": 1e-07,\n  \"floatx\": \"float32\",\n  \"backend\": \"tensorflow\"\n}\n\n\nTemporal Difference - Reinforcement Learning\nCollecting the information provided by both Keldon’s post and Temple Gates’ blog post, we learned that the RFTG AI follows Tesauro’s TD-Gammon ideas, using a Temporal Difference (TD) neural network. And yes, neural network and reinforcement learning techniques have been applied to games since the ’90s.\n\nOne of the biggest attractions of reinforcement learning is that it does not require a pre-defined model (i.e. it is model-free) and needs no human input to generate the training data. The neural network learns by repeatedly playing against itself; for instance, the RFTG AI was trained iteratively over 30,000 simulated games to find the weights for the neural network nodes.\nThe innovative idea of this learning algorithm consists of updating the weights in its neural net after each turn to reduce the difference between its evaluation of the previous turn’s board position and its evaluation of the present turn’s board position—hence “temporal-difference learning”.\n\nDuring network training, the AI examines on each turn the set of possible legal moves and all their possible responses (via the optimal policy), feeds each resulting game state into its evaluation function, and chooses the action that leads to the highest score for the player. TD’s innovation was in how it learned its evaluation function incrementally using reinforcement learning.\n\nAfter each turn, the learning algorithm updates each weight in the neural net according to the following rule:\n\n\\[w_{t+1} - w_t = \\alpha(Y_{t+1} - Y_t)\\sum_{k=1}^{t}\\lambda^{t-k} \\nabla_w Y_k\\]\n\nwhere:\n\n\n  $w_..."
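To make the update rule quoted above concrete, here is a tiny sketch of a TD(lambda)-style weight update for a linear evaluation function. The linear form is a simplifying assumption for clarity; the actual RFTG AI uses a neural network, so the gradient term would come from backpropagation instead.

import numpy as np

ALPHA, LAMBDA = 0.01, 0.7  # learning rate and trace-decay parameter

def td_lambda_update(w, trace, x_t, y_t, y_next):
    """One turn of w_{t+1} - w_t = alpha * (Y_{t+1} - Y_t) * sum_k lambda^(t-k) grad_w Y_k.

    For a linear evaluation Y_k = w . x_k, the gradient grad_w Y_k is just x_k,
    so the eligibility trace accumulates the decayed feature vectors.
    """
    trace = LAMBDA * trace + x_t           # sum_k lambda^(t-k) * grad_w Y_k
    w = w + ALPHA * (y_next - y_t) * trace
    return w, trace

# usage: keep w and trace across the turns of one simulated game
w = np.zeros(8)
trace = np.zeros(8)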
  },
  
  {
    "title": "Game Architecture for Card Game Action (Part 2)",
    "url": "/game-architecture-card-ai-2",
    "date": "Jun 28, 2021",
    "categories": ["post"],
    "tags": ["Game Architecture","Card Game","Python","AI"],
    "excerpt": "\nContinue from the previous Game Architecture for Card Game Model (Part 1), we defined a game architecture as a reference to study the “Race for the Galaxy” card game. This article focus on the com...",
    "content": "\nContinue from the previous Game Architecture for Card Game Model (Part 1), we defined a game architecture as a reference to study the “Race for the Galaxy” card game. This article focus on the components of (3) Game Engine and (4) Game Interface.\n\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/game-architecture-card-ai-2/Podcast_Card_Game_AI_Part_2.mp3\n\n\n\n\n\n\n\n\nFigure. Game Architecture Overview - The components are grouped according to their functional roles in the system. The functional roles are (1) Game Story and Game Asset, (2) Game Model, (3) Game Engine, (4) Game Interface, (5) Game AI, (6) Game Physics (only for physics based game), and (7) Hardware Abstraction.  When studying any game source code, this architecture will help to classify their functional roles\n\n(3) Game Engine - the rules and rendering of a game. The game states and operations are projected on a display. All legal operations are checked and animated on-screen.\n\n(4) Game Interface - the management system of a game. The game preference and setup are an integral part of running a game. The interface allows the player(s) to select optional elements of the game. All legal operations are presented and interacted with the player(s) according to the game rules.\n\nIn part 2, we shall continue to describe the game architecture using Race for the Galaxy (RFTG) for our study. As usual, a balance between theory and practice, we set up the Python development to illustrate the object-oriented game engine.\n\nRFTG Python Development\nInterest reader can find the full development set up instruction, Python source code and Jupyter notebook experiments described in this article from [Cheung21].\n\nJupyter Notebook Experiments (Part 2)\nThe development experiments on (Part 2) are recorded in the Jupyter Notebook rftg_game.ipynb to quickly run the code samples.\nInside Visual Studio code, install the Microsoft’s “Jupyter” extension. When activate the rftg_game.ipynb inside VScode, change the Python kernel to use rftg that has been setup from the code README.md instructions.\n\n\n\nRFTG Game Engine\nThe Game Engine defines a set of rules that players must follow. It also maintains the game states, by using the game information system that tracks the players and the game progress.  Extracting from the game data model, the following UML diagram highlights the classes that represent the game engine components.\n\nGame is a session that follows a definite set of rules, a set of players that interact with the game model. At the highest level, a game is a composition of a Deck of cards, a set of Player and the associated GameResource. The resources include the card design library as Library and the card display as CardDisplay.\n\n\n  Game - Game class keeps the record of global states of a game.\n  Deck - Deck class represents the cards in a game.\n  Player - Player class keeps the record of an actor action states within a game.\n  GameResource - GameResource class keeps the global resources required by a game, such as the Library and CardDisplay.\n\n\n\n\nThe players are actors either as Human or Computer (AI). Each player must make decisions concerning about game process and rules. 
Decision is the abstract class that can be implemented as an interactive UIDecision, where a human is called upon for any game action, or as an automated AIDecision, where a computer is called to act.\n\nCreate a Game\nTo create a game session, the initialization steps are as follows:\n\n  create Library and load the card designs and images\n  create CardDisplay to provide a convenient card image display\n  create Deck of cards that references the Library\n  create a set of Players who can be human or computer AI\n  create GameResource, composed of Library and Display, potentially extended with more game resources, such as network communication and data storage.\n  Finally, Game is composed of GameResource, Deck and a list of Players.\n\n\nLater, we shall discuss how the Player delegates game control to Decision.\n\nfrom rftg.cards import Library, Deck, Card\nfrom rftg.display import Display, CardDisplay\nfrom rftg.game import GameResource, Game, Player\n\n# create the game resources\nlibrary = Library()\nlibrary.read_cards('cards.txt')\nlibrary.read_card_images('card_images')\nlibrary.load_actions('card_images')\nprint('Designs: {}'.format(len(library.designs)))\n\ndisplay = Display('card_images', figsize=(16,8))\ncard_display = CardDisplay(library, display)\n\n# build the game deck\ndeck = Deck(library)\ndeck.build_deck(0)\nprint('Cards: {}'.format(len(deck.cards)))\n\n# create the players\nplayer1 = Player(name=\"Blue\", ai=False)\nplayer2 = Player(name=\"Red\", ai=False)\nplayers = [player1, player2]\n\n# create a new game with players\nresource = GameResource(library=library, display=display)\ngame = Game(resource=resource, session_id='testing', deck=deck, players=players)\n\n\nThe Game can also take ..."
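The Decision delegation described in the excerpt can be pictured with a short, hypothetical sketch; the real rftg package may use different method names and signatures.

from abc import ABC, abstractmethod
import random

class Decision(ABC):
    """Abstract strategy a Player delegates to when an action must be chosen."""
    @abstractmethod
    def choose_action(self, game, player, legal_actions):
        ...

class UIDecision(Decision):
    def choose_action(self, game, player, legal_actions):
        # ask the human which of the legal actions to take
        for i, action in enumerate(legal_actions):
            print('{}: {}'.format(i, action))
        return legal_actions[int(input('Choose an action: '))]

class AIDecision(Decision):
    def choose_action(self, game, player, legal_actions):
        # placeholder policy: pick a random legal action; the real AI would
        # score each action with its evaluation function instead
        return random.choice(legal_actions)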
  },
  
  {
    "title": "Game Architecture for Card Game Model (Part 1)",
    "url": "/game-architecture-card-ai-1",
    "date": "Jun 27, 2021",
    "categories": ["post"],
    "tags": ["Game Architecture","Card Game","Python","AI"],
    "excerpt": "\nBeing software architects, we always interest to know how a software system is built.\nAt the same time, if one is a gamer, you would meditate on how a game is being designed and constructed;\nespec...",
    "content": "\nBeing software architects, we always interest to know how a software system is built.\nAt the same time, if one is a gamer, you would meditate on how a game is being designed and constructed;\nespecially, when you are toasted by the card game’s AI that makes you “angry”.\nSuch an emotional response is pushing forward positively, becomes the driving force for a personal month-long investigation into studying the game architecture and how to construct a game. Consequently, I can study the smart AI that gets me kicked.\n\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/game-architecture-card-ai-1/Podcast_Card_Game_AI_Part_1.mp3\n\n\n\n\n\n\nCard Game - Race for the Galaxy\nThe card game in focus is Race for the Galaxy (RFTG). For reader convenience, I quoted from the official game description here:\n\n\n  In the card game Race for the Galaxy, players build galactic civilizations by playing game cards in front of them that represent worlds or technical and social developments. Some worlds allow players to produce goods, which can be consumed later to gain either card draws or victory points when the appropriate technologies are available to them. These are mainly provided by the developments and worlds that are not able to produce, but the fancier production worlds also give these bonuses.\n\n\n\n\nFigure. Tom Lehmann’s Race for the Galaxy has made an incredible impact on the board game world since its 2007 release by Rio Grande Games, spawning six expansions and supporting up to 5 players. This has led to the recent Race for the Galaxy app by Theresa Duringer and Temple Gates Games, available on iOS/Andriod/Steam with high rating. (credit: the image shows the app version of the game where the player can play online or play against one of the best card game AI.)\n\nContinue to quote from the game play:\n\n\n  At the beginning of each round, players each select, secretly and simultaneously, one of the seven roles which correspond to the phases in which the round progresses. By selecting a role, players activate that phase for this round, giving each player the opportunity to perform that phase’s action. For example, if one player chooses the settle role, each player has the opportunity to settle one of the planets from their hand. The player who has chosen the role, however, gets a bonus that applies only to them. But bonuses may also be acquired through developments, so you must be aware when another player also takes advantage of your choice of role.\n\n\nGame AI and Source Code\nWe are fortunate that Keldon Jones, who is the AI developer of the card game, described in his post how the game AI is being designed. Even though this is back in 2009, it was using neural networks and reinforcement learning to train the game AI. The game AI source code, written in C, is released under the GNU General Public License, version 2 (GPLv2). This is an excellent opportunity to learn how the game and the game’s AI are developed. In the process, the game is rewritten in Python for my better understanding so that I can run and visualize experiments with the AI code more conveniently.\n\nOutline\nThis article starts with the general game architecture to identify the important components of a game. The architecture will provide a layout of how to read a game source code. 
Then we shall show how to analyze and redesign Keldon’s RFTG C code, such that we can rewrite the game components in object-oriented Python code.\n\nIn part 1, we shall lay out the groundwork by describing a game architecture. Since the architecture components are numerous, this article will focus only on RFTG’s (1) Game Model and (2) Game Assets. As always, balancing theory and practice, we set up the Python development environment to illustrate the object-oriented conversion process of the game. This is the necessary groundwork to support the game engine. The game AI will need to wait for later articles.\n\nGame Architecture\nA game is an information system that keeps track of the states in a game universe, such that the player(s), whether human or AI, can interact with the game through a series of legal actions according to the game rules. The following is a succinct definition of a computer game that is memorable,\n\n\n  Computer Game is a simulator of the subject of interest. The Game Engine is just a real-time database with a pretty front end and definite rules.\n\n\n\nFigure. Game Architecture in a Nutshell\n\nThis is a great start for studying game architecture. The definition lays out the foundational components that a computer game engine must design and implement. This definition just sets the stage for a more detailed description next.\n\nGame Architecture Overview\nAlthough the success of a game is not determined by the architecture alone - gameplay matters more - I cannot emphasize enough the reasons to have a good architecture. Most importantly, the archit..."
  },
  
  {
    "title": "Synthesis of Neural to Symbolic Knowledge for NLP System",
    "url": "/synthesis-neural-symbolic-knowledge-nlp",
    "date": "Sep 13, 2020",
    "categories": ["post"],
    "tags": ["AI","Deep Neural Network","Neural-Symbolic","NLP","Prolog","Python"],
    "excerpt": "\nMuch of human knowledge is collected in the written language. Extracting knowledge directly from the textual form of natural language has been one of the lofty goals of Natural Language Processing...",
    "content": "\nMuch of human knowledge is collected in the written language. Extracting knowledge directly from the textual form of natural language has been one of the lofty goals of Natural Language Processing (NLP) since the beginning of AI research. The recent advance in NLP, using deep neural networks, has effectively automated the parsing and understanding of the natural language. The NLP using deep neural networks is successful because of the DNN adaptive learning ability to handle\nreal-world data when the processing is not readily describable in the traditional symbolic rules.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/synthesis-neural-symbolic-knowledge-nlp/Podcast_Neural_Symbolic_AI.mp3\n\n\n\n\n\n\nOur past exploration on “Deep Learning on Text Data” has demonstrated the basics of NLP using DNN.\nIn this article, we shall take another leap into the next stage of NLP. After the NLP accurately parsed the text, a dependency graph can be computed.\nTo improve human-level understanding, we can use the traditional symbolic reasoning.\n\nThe combination of symbolic AI and emerging NLP tools that recently evolved from deep neural network researches start to mature.\nWe believe that this high-level symbolic reasoning and low-level statistical learning are complementary according to AI experts [Launchbury17].\nBy working them together, they will take significant forward steps in natural language understanding.\nSubsequently, humans can use the symbolic explanation to understand the AI model’s reasoning and to improve the human-machine interactions.\n\n\nFigure. Neural to Symbolic NLP system architecture shows the synergies between low-level NLP and high-level symbolic processor. By encoding the low-level parsed text into symbolic representations, human interaction can be improved by the traceable questions and answers in symbolic reasoning.\n\nIn the Neural to Symbolic NLP system architecture diagram, the future of NLP is componentized into the Natural Language Processor which has the dedicated responsibility of parsing the text accurately and flexibly with deep neural networks. The text collection is transformed into facts and rules such that the Symbolic Processor can apply high-level reasoning to the Knowledge Base of the transformed facts and rules. Furthermore, tapping into the wealth of Data Mining and Knowledge-Based Management System (KBMS), we understand how to make a large number of facts usable in AI reasoning tasks. If the human level of symbolic facts is fed into the rule-based system, the reasoning engine can search either backward-chaining or forward-chaining through a set of domain-specific rules [GiarratanoRiley04]. The most important business value is to ensure that the reasoning steps are traceable and explainable based on the original truthful observation from the human context.\n\nThis may look like another AI pipe dream but we shall take a practical engineering step using DeepRank [TarauBlanco20] to demonstrate the possibility. 
We start with the synthesis of neural to symbolic knowledge from a simple document to perform question answering; subsequently, we shall synthesize knowledge from a complex HIPAA regulations document to illustrate the greater system capabilities.\n\nSynthesis of Neural to Symbolic Knowledge\nKnowledge engineering is the process of creating both facts and rules that apply to data to imitate the way a human thinks and approaches problems. A task and its solution are broken down into their structure, and based on that information, the reasoning engine determines how the solution was reached. Traditionally, the knowledge engineering process requires a domain expert working with a knowledge engineer to manually encode the domain facts and rules into the knowledge base. The process is usually very expensive and slow.\n\nLuckily, in the domain of text documents, we can rely on the implicit textual data to state the facts about its content, i.e. the content is the domain expert itself. All we need to do is find the best knowledge representation to state the collection of implicit facts. The rules are trickier; however, we can always start with a generic textual understanding from the language grammar and the statement context.\n\nWe recognize the limitations of the currently proposed techniques [CS224N].\n\n\n  We still have primitive methods for building and accessing memories or knowledge.\n  Current models have almost nothing for developing and executing goals and plans.\n  We still have quite inadequate abilities for understanding and using inter-sentential relationships.\n  We still can’t, at a large scale, do elaborations from a situation using common-sense knowledge.\n\n\nThe above list is not impossible to tackle, but we are not covering these items here.\n\nUsing DeepRank\nWe are utilizing the recent research on DeepRank [TarauBlanco20]. The system uses a Python-based text ..."
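As a toy illustration of the forward-chaining idea mentioned above (this is not DeepRank itself, and the facts are invented for the example), a handful of extracted facts can be closed under simple rules with a small fixed-point loop:

# facts as (subject, relation, object) triples that could come from parsed text
facts = {("patient", "shares", "record"), ("record", "contains", "phi")}

# each rule: if all premises are present, add the conclusion
rules = [
    ({("patient", "shares", "record"), ("record", "contains", "phi")},
     ("patient", "needs", "consent")),
]

changed = True
while changed:                      # forward chaining until no new fact is derived
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(sorted(facts))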
  },
  
  {
    "title": "Solving Puzzles using Constraint Logic Programming in Prolog",
    "url": "/solving-puzzles-using-clp",
    "date": "Sep 03, 2020",
    "categories": ["post"],
    "tags": ["AI","Prolog","Logic","Puzzle"],
    "excerpt": "\nSince the last article on “Using Prolog to Solve Logic Puzzles” 4 years ago,\nI finally woke up and discovered how to use the amazing clp(fd) - Constraint Logic Programming (Finite Domain) module.\n...",
    "content": "\nSince the last article on “Using Prolog to Solve Logic Puzzles” 4 years ago,\nI finally woke up and discovered how to use the amazing clp(fd) - Constraint Logic Programming (Finite Domain) module.\nVarious implementation of clp(fd) existed in different Prolog dialects but the concepts are essentially shared.\nTo illustrate how clp(fd) is a perfect fit for many combinatorics problems, we shall explore by using SWI Prolog implementation of clp(fd) to solve a few types of logic puzzle.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/solving-puzzles-using-clp/Podcast_Solving_Puzzle_Constraint_Logic.mp3\n\n\n\n\n\n\nConstraint logic programming is naturally fit into the paradigm for logic languages like Prolog,\nin which “relations between variables are stated in the form of constraints.” For instance, the following expressions:\n\n\n  X + Y (where X and Y are unconstrained)\n  X + Y &gt; 0 (where X and Y constraint that the formula has to be greater than 0, that condition must be met to resolve X and Y)\n\n\nThe term “Finite Domain” is just a fancy way of saying all numbers \\(\\in \\mathbb{Z}\\).\nIf we can transform a puzzle into Integers, clp(fd) can be applied to transform the puzzle into an efficient combinatorics search problem.\n\nSince clp(fd) is not loaded when SWI Prolog started, you can simply import the module by,\n\n:- use_module(library(clpfd).\n\n\nAs a learning exercise, we have converted 3 types of commonly known logic puzzles: Cryptarithmetic Puzzle, Logic Puzzle, and Spatial Logic Puzzle. These can be elegantly and efficiently solved with Prolog and constraints. The first 2 types of puzzles can be very directly modelled and solved as combinatorial tasks. The third type needs more effort to find a suitable formulation as such tasks. After encoding all of these puzzles as integers, Prolog constraints can search over the different states efficiently.\n\nCryptarithmetic Puzzle - Summation Problem\nThe classical SEND + MORE = MONEY cryptarithmetic puzzle constrained the assignment of letters between the digits 0 thru 9. They spell out “SEND MORE MONEY” and when read as base 10 numbers create a true mathematical formula. 
An additional constraint is that a leading letter is not permitted to be zero.\n\nHere is a similar cryptarithmetic puzzle, FORTY + TEN + TEN = SIXTY, which we shall solve here.\n\n\n\nUsing clp(fd), the solution is naturally expressed in the specified constraints [CLPFD-TUTOR],\n\n:- use_module(library(clpfd)).\n\npuzzle_sixty([F,O,R,T,Y] + [T,E,N] + [T,E,N] = [S,I,X,T,Y]) :-\n        Vars = [F,O,R,T,Y,E,N,S,I,X],\n        Vars ins 0..9,\n        all_different(Vars),\n        F*10000 + O*1000 + R*100 + T*10 + Y +\n        2*(T*100 + E*10 + N) #=\n        S*10000 + I*1000 + X*100 + T*10 + Y,\n        F #\\= 0, T #\\= 0, S #\\= 0,\n        label(Vars).\n\n\nStart SWI-Prolog and import the puzzles.pl file,\n\n$ swipl\n\n?- [puzzles].\ntrue.\n\n?- puzzle_sixty(X).\nX =  ([2, 9, 7, 8, 6]+[8, 5, 0]+[8, 5, 0]=[3, 1, 4, 8, 6]) ;\nfalse.\n\n\nThere is room for improvement. For instance, since the letter position defines the power-of-10 multiplier of each digit,\n“FORTY” is expressed as,\n\nF*10000 + O*1000 + R*100 + T*10 + Y\n\n\nThis is not ideal and is error-prone to write.\nWe can generalize by defining the relation between a list of digits and the represented number:\n\ndigits_number(Ds, N) :-\n        length(Ds, _),\n        Ds ins 0..9,\n        reverse(Ds, RDs),\n        foldl(pow, RDs, 0-0, N-_).\n\npow(D, N0-I0, N-I) :-\n        N #= N0 + D*10^I0,\n        I #= I0 + 1.\n\n\nThen, we can convert to use the digits_number/2 relation to make the program read more elegantly,\n\n:- use_module(library(clpfd)).\n\npuzzle_sixty_new([F,O,R,T,Y] + [T,E,N] + [T,E,N] = [S,I,X,T,Y]) :-\n        Vars = [F,O,R,T,Y,E,N,S,I,X],\n        Vars ins 0..9,\n        all_different(Vars),\n        digits_number([F,O,R,T,Y], FORTY),\n        digits_number([T,E,N], TEN),\n        digits_number([S,I,X,T,Y], SIXTY),\n        FORTY + 2 * TEN #= SIXTY,\n        F #\\= 0, T #\\= 0, S #\\= 0,\n        label(Vars).\n\n\nAs expected, the solution should be the same as before.\n\n?- puzzle_sixty_new(X).\nX =  ([2, 9, 7, 8, 6]+[8, 5, 0]+[8, 5, 0]=[3, 1, 4, 8, 6]) .\n\n\nLogic Puzzle - Revisiting the Zebra Puzzle\n\nWe can also revisit the Zebra Puzzle using the constructs of constraint programming.\nJust for convenience, we shall repeat the puzzle and then provide a solution using clp(fd) [CLPFD-PUZZLE].\n\nThe Zebra Puzzle comes with 15 facts and 2 questions:\nWho has a zebra and who drinks water?\n\n\n\nThe list of facts (or constraints):\n\n\n  There are 5 colored houses in a row, each having an owner, who has an animal, a favorite cigarette, and a favorite drink.\n  The Englishman lives in the red house.\n  The Spaniard has a dog.\n  They drink coffee in the green house.\n  The Ukrainian drinks tea.\n  The green house is next to the white house.\n  The Winston smoker has a serpent.\n  In the yellow hous..."
  },
  
  {
    "title": "Dempster-Shafer Theory for Classification using Python",
    "url": "/dempster-shafer-theory-for-classification",
    "date": "Aug 21, 2020",
    "categories": ["post"],
    "tags": ["Dempster-Shafer Theory","Machine Learning","Classification","Python"],
    "excerpt": "\nMachine Learning is dominated by ANN (Automated Neural Network), it requires a large training data set of labelled data to learn a classification model.\nWhen only a small data set is available, th...",
    "content": "\nMachine Learning is dominated by ANN (Automated Neural Network), it requires a large training data set of labelled data to learn a classification model.\nWhen only a small data set is available, the decision tree &amp; its variant random forests dominated the classification.\nIn this article, we shall explore the Dempster-Shafer Theory as the theoretical basis for classifiers on a small data set, where classification is operated on the principle of combining pieces of evidence.\n\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/dempster-shafer-theory-for-classification/Podcast_Dempster_Shafer_Theory_for_Classification.mp3\n\n\n\n\n\n\n\n\nFigure. Images of Iris spieces. We shall attempt to use the petal width, petal height, sepal width, sepal height measurements to perform classification for the classes of Iris Setosa, Iris Versicolour, or Iris Virginica in this article.\n\nDempster-Shafer Theory [DST][GS76][GS90] is a mathematical theory of evidence,\noffers an alternative to traditional probabilistic theory for the mathematical representation of uncertainty.\nThe significant innovation of this framework is that it allows for the allocation of a probability mass to sets or intervals\nas opposed to mutually exclusive singletons. In contrast, Bayesian inference requires some a priori knowledge and is unable to assign a probability to ignorance. D-S is a potentially valuable tool for the evaluation of risk and reliability in engineering applications when it is not possible to obtain a precise measurement from experiments, or when knowledge is obtained from expert elicitation.\nAn important aspect of D-S theory is the combination of evidence obtained from multiple sources and the modelling of conflict between them.\n\nThe motivation for selecting D-S theory can be characterized by the following reasons [SF02]:\n\n\n  The relatively high degree of theoretical development among the nontraditional theories for characterizing uncertainty.\n  The relation of Dempster-Shafer theory to traditional probability theory and set theory.\n  The large number of examples of applications of Dempster-Shafer theory in engineering in the past.\n  The versatility of the Dempster-Shafer theory to represent and combine different types of evidence obtained from multiple sources.\n\n\nUsing our philosophy of learning by doing, we shall take the practical approach to demonstrate how to use Python pyds Dempster-Shafer module.\nStarting with some experiment with Dempster-Shafer belief functions, then we shall progress into classification on the “Iris Plant Dataset” [IPD] using D-S theory. The result is achieving 96% accuracy, which is comparable to other ML models.\n\nPython Installation\n\nVirtual Environment\nUsing an isolated Python virtual environment will protect you from headaches and disaster of installations.\ndst (or your choice of name) is the name of the virtual environment, and python=3.5 is the Python version.\n\nconda create -n dst python=3.5\n\n\nPress y to proceed. This will install the Python version and all the associated anaconda packaged libraries at `{path_to_anaconda_location}/envs/crypto\n\nThen activate the virtual environment by,\n\nsource activate dst\n\n\npyds Module\npyds is a Python library for performing calculations in the Dempster-Shafer theory of evidence. 
It is the best and most comprehensive in the following aspects,\n\n  Support for normalized as well as unnormalized belief functions\n  Different Monte-Carlo algorithms for combining belief functions\n  Various methods related to the generalized Bayesian theorem\n  Measures of uncertainty\n  Methods for constructing belief functions from data\n\n\nEnsure that you are in the virtualenv,\n\npip install pyds\n\n\nor it is just as easy to install from the downloaded source https://github.com/reineking/pyds\n\ncd pyds; python setup.py install\n\n\nOther Modules\nThe following other modules are required to run our experiments,\n\npip install numpy\npip install pandas\npip install matplotlib\npip install seaborn\n\n\nThese modules are important but not described in detail because this article focuses on D-S theory. We shall just use them as processing and display tools.\n\nDempster-Shafer Evidence Theory\nWhen an expert makes an observation on some evidence, he could say,\n\n  Expert: “I’m fairly sure that the evidence was either \\(a\\) or \\(b\\). Probably \\(a\\), though it could have been \\(b\\). I could be wrong though.”\n\n\nThe statement “Probably \\(a\\) or \\(b\\)” can be translated into\n\\(A = \\{a, b\\}\\); one abstains from making any statement about whether \\(a\\) is more probable than \\(b\\).\nInstead, such an assignment expresses complete ignorance about this question.\nThere are several equivalent representations for quantifying belief within the D-S\nbelief function framework. The four most important representations are [RT14],\n\n\n\nFigure. Illustration of different belief function representations. The frame of disc..."
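The expert statement above translates directly into a mass function. A small sketch with pyds, assuming the MassFunction interface from the reineking/pyds repository (where a string key such as 'ab' stands for the set {a, b}):

from pyds import MassFunction

# most mass on {a, b}, some on {a} alone, and the remainder on full ignorance {a, b, c}
m1 = MassFunction({'ab': 0.6, 'a': 0.3, 'abc': 0.1})
m2 = MassFunction({'b': 0.5, 'abc': 0.5})    # a second, partly conflicting source

combined = m1 & m2                           # Dempster's rule of combination
print(combined)
print(combined.bel('a'), combined.pl('a'))   # belief and plausibility of {a}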
  },
  
  {
    "title": "Adventures in Deep Reinforcement Learning using StarCraft II",
    "url": "/adventures-in-deep-reinforcement-learning",
    "date": "Nov 15, 2019",
    "categories": ["post"],
    "tags": ["Reinforcement Learning","StarCraft II","AlphaStar","AI"],
    "excerpt": "\nThe paradigm of learning by trial-and-error, exclusively from rewards is known as Reinforcement Learning (RL). The essence of RL is learning through interaction, mimicking the human way of learnin...",
    "content": "\nThe paradigm of learning by trial-and-error, exclusively from rewards is known as Reinforcement Learning (RL). The essence of RL is learning through interaction, mimicking the human way of learning with an interaction with environment and has its roots in behaviourist psychology. The positive rewards will reinforce the behaviour that leads to it.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/adventures-in-deep-reinforcement-learning/Podcast_Deep_Reinforcement_Learning_StarCraft_II.mp3\n\n\n\n\n\n\nFor a definition of the reinforcement learning problem we need to define an environment in which a perception-action-learning loop takes place. In this environment, the agent observes a given state t. The agent, learning in the policy, interacts with the environment by taking an action in a given state that may have long term consequences. It goes into a next state with a given timestep t+1 and updates the policy. At the end, the agent receives observations/states from the environment and a reward as a sign of feedback, and interacts with the environment through the actions.\n\nThe reinforcement learning problem can be described formally as a Markov Decision Process (MDP): it describes an environment for reinforcement learning, the surroundings or conditions in which the agent learns or operates. The Markov process is a sequence of states with the Markov property, which claims that the future is independent of the past given the present. The sufficiency of the last state makes that we only need the last state to evaluate the agent future choices. While deep neural network requires a lot of supervised training data, and inflexible about the modelled world changes. On the other hand, reinforcement learning can handle the world changes and maximize the current selection.\n\nUsing PySC2 helps to understand the practical aspect of reinforcement learning, rather than starting with toy example, the complexity of StarCraft II game is more realistic, the AI needs to balance resources, building, exploring, strategizing and fighting. The balance of multiple objectives and long term planning in order to win, makes the game felt realistic in complexity. The current techniques are mostly focus on single agent learning. The potential is to extend into multi-agents learning that applying collaborative game theory (Do you remember the movie “A Beautiful Mind”?)\n\n\n\nFigure. This shows the running state of the StarCraft II Learning Environment. The top left shows the actual StarCraft II game running. The SC2LE captures and reports the observations from the game environment. SC2LE allows a visual display of all the game observations on the right. The AI agent can take the observations and evaluates the optimal actions.\n\nGames are ideal environments for reinforcement learning research. RL problems on real-time strategy (RTS) games are far more difficult than problems on Go due to complexity of states, diversity of actions, and long time horizon. The following is my practical research notes that capture this learning and doing process. This article is intended to provide a concise experimental roadmap to follow. Each section starts with a list of reference resources and then follows with what can be tried. 
Some information is excerpted from the original sources for the reader’s convenience, in particular to learn how to set up and run the experiments. As always, the ability to use Python is fundamental to the adventures.\n\nPySC2 Installation\n\n  Ref: StarCraft II Learning Environment, https://github.com/deepmind/pysc2\n  Ref: StarCraft II Client - protocol definitions used to communicate with StarCraft II https://github.com/Blizzard/s2client-proto\n\n\nPySC2 is DeepMind’s Python component of the StarCraft II Learning Environment (SC2LE). It exposes Blizzard Entertainment’s StarCraft II Machine Learning API as a Python RL Environment. This is a collaboration between DeepMind and Blizzard to develop StarCraft II into a rich environment for RL research. PySC2 provides an interface for RL agents to interact with StarCraft II, getting observations and sending actions.\n\nInstall by,\n\nconda create -n pysc2 python=3.5 anaconda\nconda activate pysc2\n\npip install pysc2==1.2\n\n\nYou can run an agent to test the environment. The UI shows you the actions of the agent and is helpful for debugging and visualization purposes.\n\npython -m pysc2.bin.agent --map Simple64\n\n\nThere is a human agent interface that is mainly used for debugging, but it can also be used to play the game. The UI is fairly simple and incomplete, but it’s enough to understand the basics of the game. Also, it runs on Linux.\n\npython -m pysc2.bin.play --map Simple64\n\n\nRunning an agent and playing as a human both save a replay by default. You can watch that replay by running:\n\npython -m pysc2.bin.play --replay &lt;path-to-replay&gt;\n\n\nThis works for any replay as long as t..."
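Once the environment runs, the next step is usually a custom agent. Below is a minimal do-nothing agent sketched against the pysc2 1.2 API pinned in the excerpt; the module name used in the run command is a placeholder.

from pysc2.agents import base_agent
from pysc2.lib import actions

class NoOpAgent(base_agent.BaseAgent):
    """Smallest possible agent: observe the state and always issue no_op."""
    def step(self, obs):
        super(NoOpAgent, self).step(obs)
        return actions.FunctionCall(actions.FUNCTIONS.no_op.id, [])

Saved as my_agents.py, it should be runnable with python -m pysc2.bin.agent --map Simple64 --agent my_agents.NoOpAgent before any learning logic is layered on top.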
  },
  
  {
    "title": "Attribute-based Encryption for Healthcare Blockchain",
    "url": "/attribute-based-encryption-for-healthcare-blockchain",
    "date": "Apr 28, 2019",
    "categories": ["post"],
    "tags": ["Cryptography","Attribute-based Encryption","Healthcare","Blockchain","FHIR"],
    "excerpt": "\nIt’s no surprise that one of the greatest concerns for a healthcare provider is data security. One can argue that the data ownership is the most important asset in this information age. Healthcare...",
    "content": "\nIt’s no surprise that one of the greatest concerns for a healthcare provider is data security. One can argue that the data ownership is the most important asset in this information age. Healthcare patients are aware of the value of their personal information.\nAs a result, a healthcare provider required to switch from a centralized ownership to a distributed ownership. The patient now has the full control of his health data while the provider only has permission to access to the data when necessary.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/attribute-based-encryption-for-healthcare-blockchain/Podcast_Blockchain_ABE.mp3\n\n\n\n\n\n\nProvided that a section of the private data is obscured before usage, a patient could be incentivized to share part of their health related data for the advancement of medical research.\nIn order to make this decentralized data ownership and sharing work, it requires a new technique to protect the data privacy and provide a secure system.\n\n\n\nFigure. Artistic style of Kandinsky transfer to a healthcare blockchain image\n\nOur previous article on “Why Blockchain for Healthcare?” explained how to use blockchain to share data with the permissioned parties.\nIn this article, we’ll expand on the discussion of healthcare data sharing,\nwith a focus on the data privacy through Attribute-based Encryption.\nWhen a patient decided to share part of his/her health data, the data is recorded into a blockchain confidentially. Only the permissioned parties can decrypt the shared data.\n\nAs we explore the concept and practice of attribute-based encryption for healthcare blockchain,\nwe’ll go through the following topics from a system perspective, simplifying the mathematical aspects.\nOf course, for the mathematically inclined readers, the original papers are listed in the References section for a deeper understanding.\n\nPersonal Health Record (PHR) Security\nTo start our journey into this new sharing convention,\nFast Healthcare Interoperability Resources (FHIR) is the industry standard\nfor fast and efficient storage/retrieval of health data.\nUnits of health data in FHIR are referred to as “Resources”. FHIR de\ffines multiple such resources, such as Patient, Practitioner, Observation, MedicalRequest, etc. where each resource can be linked to multiple other resources.\nIn the previous post on FHIR Server Up and Running, we have provided a practical tutorial to setup a FHIR RESTful API service with the simulated patient health data. 
As always, our philosophy of balancing theory and practice is helpful when we need to comprehend a new concept.\n\nWhile FHIR enables health data to be published easily,\ninadvertent or malicious disclosure of data that contains Personally Identifiable Information (PII)\nto unauthorized individuals or organizations may have catastrophic consequences.\nThus, healthcare providers must comply with federal and state policies when they release sensitive medical data.\nFor example, the compliance policies, such as HIPAA and HITECH, must be carefully studied and enforced for auditing.\nEnsuring data security is a definite first step towards compliance.\n\nAttribute-based Encryption (ABE)\nAttribute-based Encryption (ABE) is a relatively recent approach\nthat reconsiders the concept of public-key cryptography\n[SW05][GPSW06][BSW07][Waters11].\nIn traditional public-key cryptography, a message is encrypted for a specific receiver using the receiver’s public key. Identity-based cryptography, and in particular identity-based encryption (IBE), changed the traditional understanding of public-key cryptography by allowing the public key to be an arbitrary string, e.g., the email address of the receiver.\n\nABE goes one step further and defines the identity not as atomic but as a set of attributes, e.g., roles and context, and messages can be encrypted with respect to subsets of attributes (key-policy ABE - KP-ABE) or policies defined over a set of attributes (ciphertext-policy ABE - CP-ABE). The key point is that someone should only be able to decrypt a ciphertext if the person holds a key for “matching attributes”, where user keys are always issued by some trusted party.\n\nIdentity and Attributes\nAt the time of registering a patient or a practitioner, attributes can be specified for them,\nwhich are then added to their X.509 certificates upon their enrollment with a Certificate Authority (CA).\nExamples of attributes include a role name such as “Patient” or “Practitioner”\nthat is agreed upon by the organizations participating in the network.\nWhen a smart contract is executed,\nit can extract the identity and attributes before the invoke or query transaction.\nAt a simple level, this also allows application-level attributes to be passed down\ninto the smart contract through an X.509 certificate. The participant’s identity and attributes are fully secured and authenticated, i.e. no o..."
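To make the CP-ABE flow concrete, here is a small sketch assuming the charm-crypto toolkit and its CPabe_BSW07 scheme; this is an illustrative assumption, not the implementation used by the blockchain platform discussed in the excerpt, and the attribute names and policy are invented.

from charm.toolbox.pairinggroup import PairingGroup, GT
from charm.schemes.abenc.abenc_bsw07 import CPabe_BSW07

group = PairingGroup('SS512')
cpabe = CPabe_BSW07(group)

(pk, mk) = cpabe.setup()                                   # trusted authority keys
doctor_key = cpabe.keygen(pk, mk, ['PRACTITIONER', 'CARDIOLOGY'])

msg = group.random(GT)                                     # stands in for a symmetric key protecting a FHIR resource
policy = '(PRACTITIONER and CARDIOLOGY)'                   # who may decrypt
ct = cpabe.encrypt(pk, msg, policy)                        # the ciphertext carries the policy

recovered = cpabe.decrypt(pk, doctor_key, ct)              # succeeds: the attributes satisfy the policy
assert recovered == msg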
  },
  
  {
    "title": "Geospatial Granular Computing",
    "url": "/geospatial-granular-computing",
    "date": "Dec 20, 2018",
    "categories": ["post"],
    "tags": ["Geospatial","Granular Computing","Spatial Granule","Geocoding"],
    "excerpt": "\nGranular Computing can be conceived\nas a framework of theories, methodologies, techniques, and tools\nthat make use of information granules in the process of problem solving. In particular,\nthe gra...",
    "content": "\nGranular Computing can be conceived\nas a framework of theories, methodologies, techniques, and tools\nthat make use of information granules in the process of problem solving. In particular,\nthe granular computing has been implicitly used in geospatial representation and processing\nin order to conquer the complexity and diversity of a large geographic data set.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/geospatial-granular-computing/Podcast_Geospatial_Granular_Computing.mp3\n\n\n\n\n\n\nStarting with the questions on “what is granular computing?” and “why is that important for geospatial problem solving?”,\nwe shall step back from the long history of specific problem solving, to re-imagine a theory and to re-structure\nthe spatial constructs. Using the ideas from granular computing, we shall re-discover a solution to geospatial problem.\nFor concrete illustration of the theory, an important geospatial problem - Geocoding is chosen.\n\nGeocoding is the process of taking an address or named location\nand returning a longitude and latitude. With a georeferenced longitude and latitude,\nmany geospatial analysis and processing are made possible, such as mapping, routing, proximity searches, etc. \nSince our focus is on “georeferenced” spatial parts, we shall assume spatial is meant geospatial from this point.\nAdditionally, the theory formulated in this article, can be extended to other geospatial analytic problems.\n\n\n\nFigure. Geocoding is the process of converting addresses (like a street address) into geographic coordinates (like latitude and longitude), as if the dropped red-pin is the result of the geocoded address.\n\nExplaining the theory of geospatial granular computing, we shall explore the following topics.\n\nAs a side note, another unification effort is the theory based on Geospatial Semantic Web [ZhZhLi15].\nGeospatial semantic web is using RDF representation of spatial data and GeoSPARQL as query language.\nThe semantic representation can provide infrastructure to a knowledge engine to support\npowerful reasoning and information retrieving from heterogenous data sources.\nHowever, the downside of representing structured geospatial data in these languages\ncan result in inefficient data access, which is the main reason that we stay away from this approach.\n\nThinking in Granular Computing?\nIn a philosophical sense, granular computing has been foundational to human ability to analyze the complex world,\nby abstracting into a connected and hierarchical levels of granularity.\nTo help reasoning with a specific interest, we come to switch among different granularities.\nFor instance, if we are planning for a long distance trip, we would first plan in the macro country-level travel.\nThen we assume in the context of destination country,\nwe come to switch to micro city-level, and then street-level granularity to reach a point of interest.\nBy focusing on different levels of granularity, the relevant levels of knowledge can enhance understanding of\nthe inherent knowledge structures. 
Structural system thinking is thus an essential skill in human problem solving\nand hence a very significant aspect of intelligence, in this case spatial reasoning.\n\nIn the book chapter A Unified Framework of Granular Computing (chapter 17 of [PeSk08]),\nYiyu Yao presented granular computing as offering these three unified perspectives across problem domains.\n\n\n  Philosophical perspective, offers a new world view that leads to structured thinking.\n  Methodological perspective, deals with structured problem solving.\n  Computational perspective, concerns structured information processing.\n\n\n\n\nFigure. The three perspectives of Granular Computing form a conceptual framework to unify cross-domain principles. (image credit: Yiyu Yao)\n\nMost importantly, Yao argued that the important advantage of converging different domains\ninto a shared conceptual framework using granular computing is that it can extract the high-level commonalities of different domains\nand synthesize their results into an integrated whole by ignoring the particular low-level details.\nSubsequently, the shared concepts make explicit the ideas hidden in domain-specific discussions\nin order to arrive at a set of domain-independent principles.\n\nBy overlaying the granular computing concepts onto the geospatial domain,\nthe integration of the three perspectives on system thinking\nresults in a holistic understanding of geospatial processing that emphasizes structures embedded in a web of granules.\n\nSpatial Granule\nA complex spatial problem consists of interconnected and interacting spatial parts.\nEach spatial part can consist of other parts.\nEach spatial part or group of parts may be considered a spatial granule.\nThese spatial granules become the primitive notion of geospatial granular computing.\nThe physical meaning of spatial granules may be made clea..."
  },
  
  {
    "title": "Preparing Geospatial Data in PostGIS",
    "url": "/preparing-geospatial-data-in-postgis",
    "date": "Dec 11, 2018",
    "categories": ["post"],
    "tags": ["Docker","PostGIS","Geospatial","OpenStreetMap"],
    "excerpt": "\nThe Spatial is a popular extension to the traditional database systems.\nWhen the data has some spatial attributes, for example a street address or a phone number,\nwe can use the spatial proximity ...",
    "content": "\nThe Spatial is a popular extension to the traditional database systems.\nWhen the data has some spatial attributes, for example a street address or a phone number,\nwe can use the spatial proximity to analyze and calculate their relationships in space.\nWe are usually designating a finite geometric representation and operation on these spatial entities.\nIf you are interested to use the public domain spatial data to solve your analysis problem,\nread on for a gentle introduction to geospatial data and how to load and use them in PostGIS.\n\n\nThe World as Unified One?\nLet’s begin with a curious map projection,\nDymaxion Map is a projection of a world map onto the surface of an icosahedron,\nwhich can be unfolded and flattened to two dimensions. The flat map is heavily interrupted in order to preserve shapes and sizes.\n\n\n\nFigure. The Dymaxion map projection can be folded into a Platonic solid of Icosahedron (image credit: Buckminster Fuller Institute )\n\nWith this projection,\nit shows an almost contiguous land mass comprising all of earth’s continents – not groups of continents divided by oceans.\nThis version depicts the earth as “one island” is claiming to help unifying humanity better than other methods\nof how the map is projected. Another interesting result of using this projection to support analysis,\nthe Early Human Migrations according to mitochondrial population genetics can be explained (see the picture).\nFrom this projected perspective, the migration pattern is almost a natural deduction from the spatial arrangement.\nWe may not be able to immediately solve many humanity issues with Dymaxion map.\nBut positively, the spatial perspective does help us to see.\n\nOutlines\nIn this article, you can expect to begin with the concepts of geospatial data.\nAs always the concepts are supplimented by practical experimentations.\nUsing Docker to quickly spin up a PostGIS instance, you can load the OpenStreetMap (OSM) data into the database. \nSubsequently, using QGIS to connect with this PostGIS instance, the spatial data can be analyzed and visualized interactively.\nBy the end, you are equipped with a firm perspective of how to prepare geospatial data for the future geospatial analysis.\n\n\n1. What is Geospatial?\nThe definition of Geospatial consisted of 2 closely related concepts,\nGeo as in geographic information and Spatial as in representation of objects arranged in space.\nThe relation of a set of spatial coordinates that is reference to a concrete place on earth,\nwe shall say these spatial data are georeferenced; hence, the term geospatial is combined.\nThe best known geospatial application, GIS (Geographic Information System)\nis recognized as a system of hardware and software used for the input,\nstorage, retrieval, mapping, display and analysis of geospatial data.\n\nGeospatial data come in layers, and each layer has two types of information:\ngraphic and attributes. The former is also called geometric data, and the latter thematic data.\nInformation in the layers can be represented in two forms: raster and vector.\n\n\n\nFigure. A typical example of map layers - composition of coastlines, borders and rivers\n\n1.1. 
Representations\nThe raster and vector representations are two different geospatial conceptions used to model the real world.\nVector data are graphics, commonly represented by three geometric types: points, lines and polygons.\nAs geospatial data, these graphic objects are situated in georeferenced space according to their coordinates.\nIn the case of raster data, the layer (or space) is considered a grid\nwhere each cell (also called a pixel) represents a basic element of information.\nThe raster representation assumes the space exists beforehand, and the object is placed in it.\nThis dichotomy between the vector and raster representations leaves us with two different\nschools of thought concerning space: in the vector representation,\nspace exists because of the objects, and without the objects there is no space;\nin the raster representation, space is an intuitive idea into which objects are placed.\n\nVector data is composed of points, lines, and polygons.\nIn a vector dataset, each point represents a value at a specific (X,Y) and (optionally) Z position in space.\nVector data is best suited for representing discrete features:\ne.g., points of interest represented by points, roads represented by lines, and city boundaries by polygons.\n\nIn contrast, raster data is composed of pixels: small, uniformly-sized grid cells.\n\n\n\nFigure. The raster representation of an area showing the color (red, green, blue) values of each cell (pixel). What the color values mean requires interpretation.\n\nIn a raster dataset, each pixel has a value, and pixels representing equivalent data have the same value.\nRasters are well-suited for representing continuous data across a broad area: for example, elevation data or temperature measurements.\nRaster pixels may also be used to represent color values: satellite imagery is an example of this kind of da..."
  },
  
  {
    "title": "Existential Crisis with Microservices using Docker",
    "url": "/existential-crisis-with-microservices-using-docker",
    "date": "Oct 08, 2018",
    "categories": ["post"],
    "tags": ["Microservices","Docker","Python","Flask"],
    "excerpt": "\nAs software architects, we have no doubt that both Microservices architecture and Docker deployment help to bring\nflexibility and scalability in system design. However, the blurring existence of a...",
    "content": "\nAs software architects, we have no doubt that both Microservices architecture and Docker deployment help to bring\nflexibility and scalability in system design. However, the blurring existence of a system identity, due to\ndistributed services and virtualized machines, has lead to the question of what makes the identity of\nthis new form of design intention. \nThis article points to one of the design intension is rapid development of data-oriented services.\nAfter understanding the basic of Docker deployment, this article offers a Starter Framework using Flask/PostgreSQL/Docker\ncan be used to rapidly construct any data-oriented service with a database backend.\nFrom a larger background scheme, this article prepares the reader for a greater architectural composition of the future system.\n\n\n\n\nFigure. Is a personal Identity (1) a body (2) a brain or (3) an invisible soul? Comparing to a software system (1) machine (2) software or (3) intention to raise the system’s existential question.\n\nIn this article, you can expect to learn:\n\n\n  Start with Philosophical Musing on System Existence\n  Follow by Understanding Docker\n  Then, construct a Rapid Microservices using Docker\n    \n      Learn how to build the Microservices stack with Docker Orchestration\n      Learn how to build Persistent Volumes with Docker - Data-only Container Pattern\n      Learn how to create Postgres DB Image\n      Learn how to create API starter-api Image\n      Learn how to use the API image to import Sample (Medical Procedure Codes) into DB\n    \n  \n  Finally, conclude with What’s Next\n\n\n\nPhilosophical Musing on System Existence\nThe expression “this system is my baby”, is a common analogy that a software system creator ascribe to his invention.\nThe emotional attachment is the creator’s design and his creative spirit are manifested in the system.\nImagine if the software is the brain of the system, the body is the physical machine to run the software.\nThe soul of the system, where the creator’s desire and intention are encoded, is displayed by the style of functionalities.\nThe first experience of running a Docker container, with it’s virtual and temporal existence,\nstrangely invokes the philosophical amusment on “existence”.\n\nLet’s run the Docker https://www.docker.com/ command,\n\n$ docker run ubuntu echo “hello world”\n\n\nThe output displays on the terminal,\n\nhello world\n\n\nThe machine responsible to say “hello world” has gone through,\n\n\n  Docker pull down an image of the lastest ubuntu os image from Docker registry\n  Docker start an instance of ubuntu image as a running container - the virtual machine\n  Docker execute the bash shell command echo \"hello world\"\n  Docker display the terminal output “hello world” to the host machine\n  Docker exit the virtual machine and terminated\n\n\nThe most tenderizing thought is the virtual machine just existed for a brief moment to execute the command,\nand then it disappears back into the void. Both the machine body and the software brain are transient,\nonly the soul of the echo program remains to show the trace of saying “hello world” to us. 
\nHence, the existential question: does your software system really exist in the distributed and virtual world,\nor does only the soul of the software system exist?\n\nLet’s first seek to understand what Docker can offer; then we are prepared to answer the question of this\nnew design form of rapid development.\n\n\nUnderstanding Docker\nIn this section we shall define the basic Docker terminology and learn how exactly a Docker image works.\n\n\n  \n    Docker image -\nA Docker image is like a golden template. An image consists of an OS (Ubuntu, CentOS, etc.)\nand applications installed on it. These images are called base images. A Docker base\nimage is the building block of a Docker container, from which a container can be created.\nAn image can be built from scratch using Docker’s built-in tools. You can also use Docker\nimages created by other users from the Docker public registry (Docker Hub) as a base image\nfor your containers.\n  \n  \n    Docker registry - \nA Docker registry is a repository for Docker images. It can be public or private. The public\nDocker registry maintained by Docker is called Docker Hub. Users can upload and\ndownload images from the Docker registry. The public Docker registry has a vast\ncollection of official and user-created images. To create a container, you can either use\npublic images created by other users or you can use your own images by uploading them to\nthe public or private registry.\n  \n  \n    Docker container -\nA container is essentially a directory and an execution environment for applications. It is\ncreated on top of a Docker image and it is completely isolated. Each container has its own\nuser space, networking and security settings associated with it. A container holds all the\nnecessary files and configurations for running an application. A container can be created,\nrun, started, moved and deleted.\n  \n\n\n\n\nFigure. An illustration of all Docker terminologies and how they are wor..."
  },
  
  {
    "title": "Deep Learning on Text Data",
    "url": "/deep-learning-on-text-data",
    "date": "Jul 01, 2018",
    "categories": ["post"],
    "tags": ["NLP","Deep Learning","Sentiment Analysis"],
    "excerpt": "\nLarge quantity of human communication is composed in the form of text written in natural language. The recent advance in the field of Machine Learning confirms that meaningful knowledge can be ext...",
    "content": "\nLarge quantity of human communication is composed in the form of text written in natural language. The recent advance in the field of Machine Learning confirms that meaningful knowledge can be extracted effectively. Once the general techniques of natural language processing in combination of machine learning, a wide-range of practical enterprise application can be imagined.\n\n\n\n\n  [2024/10/07] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/deep-learning-on-text-data/Podcast_Deep_Learning_on_Text.mp3\n\n\n\n\n\n\n\n\nFigure. Text data projected onto Van Gogh’s Starry Night painting, as an analogy to the dream of finding patterns out of deceptive chaos.\n\nWhat we want to learn is how to apply the machine learning technique - Deep Learning to text data.\nNot surprisingly, the business has abundance of text data,\nwhich is usually in the form of unstructured text, that could be emails, comments, or documents that is chaotic in it’s original form.\nFor Machine Learning, finding patterns out of this deceptive chaos automatically is the engineering dream of AI.\nTo tackle this natural language chaos, ML developer has the Natural Language Processing (NLP) techniques to discover patterns.\nIn this article, we shall start with the theoretical knowledge of NLP applied to ML,\nthen we shall practice the techniques on the twitter’s sentiment analysis problem,\nand apply Deep Learning to model and predict the tweets sentiment.\n\nSentiment Analysis with Machine Learning\n\nSentiment Analysis is a pretty interesting problem in the NLP space.\nWhenever there is an email coming into the customer service inbox,\nthe business wants to be able to identify the customer’s sentiment, and in the case that the customer’s sentiment is negative, we want to send it to the customer service folder for attention; otherwise we want to send it to the happy feedback inbox.\nOne way to solve this problem is to handcraft a complex set of rules and then actually implement\na program that will check whether these rules are satisfied. When an email arrived,\nit would go through an algorithm that will check for the static rules and the output would be a label\nthat says whether this email’s is positive or negative sentiment.\n\nThe static rules could be, for example like a check, which looks for specific keywords being present\nin the email or not. 
If those keywords were present, the email would be sent to the customer service folder.\nThis kind of approach may be perfectly acceptable for some businesses.\nBut given the complexity of natural language and the types of problems involved,\neven a human expert may find the rules difficult to express.\nIn such cases, if the business happens to have a large amount of historical email,\nthere is already a large body of text available.\nIn addition, if the patterns or relationships that the business is trying to model\nare changing dynamically and continuously,\nthen a machine learning approach will be a better option than a rule-based approach.\n\nWith a machine learning approach, we would design the part of the system where an incoming email\ngoes through a trained model that checks the email’s sentiment,\nand the model returns an output that says positive or negative.\nThe only difference here is that the rules, which are used to check whether an email has positive or negative\nsentiment, are derived from historical emails.\nThe ML algorithm would basically look at historical emails to derive the rules,\nand it would keep updating the rules as the business accumulates more and more historical emails.\nThe historical emails, which have already been marked as positive or negative sentiment,\nembed within them a lot of information indicating what makes an email positive or negative.\nThe ML algorithm is able to look at the historical emails, infer what those rules are by itself,\nand use those rules to check any new email and, subsequently, classify it as positive or negative sentiment correctly.\n\nTypes of Machine Learning in NLP\n\nThere are two approaches that we could take while solving any NLP problem.\nThe rule-based approach involves the programmer either knowing what the rules are\nor empirically identifying the rules using data analytics and exploration.\nIn the machine learning approach, the rules are identified by an ML algorithm.\nBut there is a workflow that needs to be set up first.\n\nMachine learning problems generally fall under a specific set of categories.\nYou would start by identifying which type of problem, or\nwhich category, the problem you are trying to solve falls into.\nOnce the problem’s category has been identified,\nthe data needs to be transformed and represented using numeric attributes.\nThen, we would apply a standard ML algorithm to those numeric data.\n\nAs previously mentioned, machine learning problems generally fall under a broad set of categories.\nThese..."
  },
  
  {
    "title": "Interactive Hex World Map using D3",
    "url": "/interactive-hex-world-map-using-d3",
    "date": "Jun 14, 2018",
    "categories": ["post"],
    "tags": ["D3","Hexagon","Tessellation","Mapping"],
    "excerpt": "\nHow does the nature inspires us about the optimal geometry?\nThe bees create their honeycombs with precision engineering, an array of prism-shaped cells with a perfectly hexagonal cross-section.\nIf...",
    "content": "\nHow does the nature inspires us about the optimal geometry?\nThe bees create their honeycombs with precision engineering, an array of prism-shaped cells with a perfectly hexagonal cross-section.\nIf you blow a layer of bubbles on the surface of water, they instantly rearrange into three-wall junctions with more or less equal angles of 120 degrees between them. You will never find a raft of square bubbles.\nThere is great explanation that the law of nature, which automatically optimized in a self-organized system, which approximate to hexagons, read more from Why Nature Prefers Hexagons (2016).\n\n\n\n\nI remembered the pleasure of reading The Computational Beauty of Nature.\nThe topics in the book taught me an important lesson, development can be divided into immediate needs for professional life\nand spiritual food for enriching life. Spatial Tessellation belongs to the spiritual side of my computing adventure,\nbecause spatial representation and reasoning are my on-going research thesis.\nI sense the pure fun of the spatial computing and try to bring my understanding closer to nature.\n\nThis article will take on the following investigation path,\n\n\n  Understanding the hexagonal coordinate system\n  Creating the hexagonal grid to cover the world\n  Aggregating the world terrian into the hexagonal grid\n  Projecting the hexagonal grid onto a globe\n  Implementing the hexagonal grid and the globe for visualization in D3\n\n\nYou will see a working interactive demo in this page; subsequently you can checkout the demo code yourselves to play.\n\nCovering Earth with Hexagonal Map Tiles\nOn the practical side of tessellation, many strategy games have good reason to choose the hexagonal tiles as their game board.\nOne of the main advantages is that the distance between the center of any tile and all its neighboring tiles is the same.\nIn term of strategy, this equidistance property enables consistent strategic moves (see the right diagram below, move to all connected cells is 1 (green arrow))\nWhen comparing with the rectangular grid,\nthe diagonal move is unfair because it takes 1 step but the equivalent orthogonal move takes 2 steps\n(see the left diagram of rectangular grid, the diagonal move (red arrow) is equal taking 2 orthogonal moves (green arrows)).\n\n\n\nI was wondering if anyone has any thoughts on marrying a hexagonal tile system\nwith the traditional geographic system (longitude/latitude). I think it would be\ninteresting to cover a globe with hexagonal tiles and be able to map a\ngeographic coordinate to a tile. This is a topic belongs to The Promise of Discrete Global Grid Systems, which is beyond my speculation.\n\nHexagon Coordinate System\nTo understand Hexagon Coordinate System and it’s mathematics,\nthe excellent website that comes to mind is Amit’s game programming information and\nhis collection of interactive explanation on Hexagon Grids.\nThis article will simply use the results explained by Amit.\n\nUnlike the familiar Cartesian Coordinate System that we learned since our high school days, which has only 1 type of grid layout.\nHexagon Coordinate System comes with 4 types of grid layout! For this article’s world map,\nI choose to use the “odd-r” horizontal layout, which shoves odd rows right.\n\n\n\n\n  You are free to choose other hex layout type but the mathematics is slightly different. 
As explained later, the hexlib.js implementation is supposed to support all 4 layout types.\n\n\nHexagon Binning\nD3 comes with a Hexagon Binning package, useful for aggregating data into\na coarser representation for display. Rather than rendering a scatterplot of tens of thousands of points,\nwe bin the points into a few hundred hexagons to show the data. The hexbin is used here to color-encode\nthe average height of the world’s terrain height image (the smaller black &amp; white picture).\n\n\n\nFor all the terrain pixels that fall within the same hex bin, the average pixel value is calculated and mapped to a color,\nwhere the gradient runs from red (low altitude) to blue (high altitude).\n\nHexagonal Grid Generation and Projection\nIn order to support the interactivity, the three Javascript files below are combined to map the screen coordinate to the corresponding hex grid coordinate. The location of the hex point is also sent to a rotating globe for display.\n\n&lt;script src=\"js/hexmap/hexlib.js\"&gt;&lt;/script&gt;\n&lt;script src=\"js/hexmap/hexlib_ui.js\"&gt;&lt;/script&gt;\n&lt;script src=\"js/hexmap/hexglobe.js\"&gt;&lt;/script&gt;\n\n\nUnfortunately, projecting the 2D hex cylindrical map onto a 3D spherical map is not perfect (no matter what you do, the projection will be distorted); as you can see towards the north/south poles, the areas of the hex cells are distorted. Maybe an interested reader can enhance the code?\n\nThe full demo source code can be checked out from the companion D3HexMap github repo.\nMore explanation will come after the demonstration.\n\nDemo Interactive Visualization\nMove your cursor over the hex grid to see the globe rotate to center (the red dot) on the..."
  },
  
  {
    "title": "YOLO for Real-Time Food Detection",
    "url": "/yolo-for-real-time-food-detection",
    "date": "Jun 07, 2018",
    "categories": ["post"],
    "tags": ["Deep Neural Network","Object Detection","YOLO"],
    "excerpt": "\nThe obsession of recognizing snacks and foods has been a fun theme for experimenting the latest machine learning techniques. The highest goal will be a computer vision system that can do real-time...",
    "content": "\nThe obsession of recognizing snacks and foods has been a fun theme for experimenting the latest machine learning techniques. The highest goal will be a computer vision system that can do real-time common foods classification and localization, which an IoT device can be deployed at the AI edge for many food applications.\n\n\nThe Snack Watcher in the previous post Snack Watcher using Raspberry Pi 3,\nwhich is using the classical machine learning techniques on the extracted image features, the recognition results are far from impressive. Due to the difficulty of hand-crafted features are affected by background objects, lightings, object position in space and object category variations. In order to reduce the error rate, the environment is required to be fine-tuned; subsequently, the environment assumptions become unrealistic that cannot be deployed in real-life settings.\n\nWith the latest improvement on Convolutional Neural Network (CNN), the image classification accuracy has been leaps and bounce in recents years (since 2014). In many instances, AI can recognize objects better than human expert. As a food detection’s technologist, the Deep Learning method is the future of food watching.\n\nThe usual difficulty with the Deep Learning is the requirement of a large dataset. Instead of investing great labor to collect the required food images,\nI have located the Food100 dataset UEC FOOD 100 (from Food Recognition Research Group at The University of Electro-Communications, Japan) contains 100-classes of food photos. Each food photo has a bounding box indicating the location of the food item in the photo. This is a perfect dataset to replace Snack Watcher to the all-encompassing Food Watcher.\n\n\n\nEven though most of the food classes in this dataset are popular foods in Japan,\nwith Toronto’s international food culture, we don’t have a hard time to recognize most of the classes, such as “green salad”, “hot dog” and “hamburger”. The training result should still be interesting in our western culture.\n\nHere is the result of YOLO Real-Time Food Detection on a 720p video stream, running on a Nvidia GTX TitanX, is ~70 fps!\n\n\n  \n\n\nContinue reading this article to understand, setup and train a custom YOLO Neural Network to achieve this result.\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/yolo-for-real-time-food-detection/Podcast_YOLO_Food_Detection.mp3\n\n\n\n\n\n\nYOLO Real-Time Object Detection\nBefore explaining the latest and greatest YOLO object detection, it is worth to understand the evolution of object detection to appreciate the contribution of YOLO.\n\nImage Classification\nThe image classification is given an input image, presenting to CNN, predicts a single class label with the probability that described the confidence that the prediction is correct. The class with the highest probability is the predicted class. 
This class label is meant to characterize the contents of the entire image; it does not localize where the predicted class appears in the image.\n\nFor example, given the input image in the Figure below (left), our CNN has labeled the image as “hot-dog”.\n\n\n\nObject Detection\nObject detection builds on top of image classification and seeks to localize exactly where in the image each object appears.\n\nWhen performing object detection, given an input image, we wish to obtain:\n\n\n  A list of bounding boxes, i.e. the (x, y)-coordinates of each object in the image\n  The class label associated with each bounding box\n  The probability/confidence score associated with each bounding box and class label\n\n\nThe Figure above (right) demonstrates an example of performing Deep Learning object detection. Notice how the “hamburger” and “french-fries” are separately classified and localized with bounding boxes.\n\nTherefore, object detection adds algorithmic complexity on top of image classification.\n\nReal-Time Object Detection\nBeyond object detection itself, the ultimate challenge is how fast the detection can be done. To reach acceptable “real-time” performance, the expectation is at least 15 fps (frames per second), i.e. running the object classification and localization at ~67 ms per image.\n\nHello, Darknet’s YOLO\nFor the longest time, detection systems repurposed classifiers or localizers to perform object detection. They apply the model to an image at multiple locations and scales, and high-scoring regions of the image are considered detections.\n\nThanks to Joseph Redmon’s Darknet implementation https://pjreddie.com/darknet/yolo/, YOLO uses a totally different approach. It applies a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities. The interested reader should study the original “You Only Look Once: Unified, Real-Time Object Detection” pap..."
  },
  
  {
    "title": "FHIR Server Up and Running",
    "url": "/fhir-server-up-and-running",
    "date": "Apr 15, 2018",
    "categories": ["post"],
    "tags": ["FHIR","Electronic Health Record","Medical","Big Data","Blockchain"],
    "excerpt": "\nBlockchain is a hot topic for making patient’s Electronic Health Record both accessible and safe, talking about the dream of patients finally own their complete medical history, drugs list, lab te...",
    "content": "\nBlockchain is a hot topic for making patient’s Electronic Health Record both accessible and safe, talking about the dream of patients finally own their complete medical history, drugs list, lab test results, doctor notes etc. But there is a fundamental problem - Where is the data coming from? Even the medical providers are digitizing the patient’s data deligently, the data remains inaccessible beyond their database boundary.\n\n\nThere are many attempts to bridge this big data chasm. The interest reader should take the Coursera’s Health Informatics on FHIR https://www.coursera.org/learn/fhir to understand the history, the problem and how FHIR could keep the Blockchain dream alive in sharing the medical records. This article takes the Blockchain believers, as Kierkegaard famously termed “A leap of faith”, to setup a functional FHIR server for research and development.\n\nFHIR Me Up!\nFHIR is a platform specification that defines a set of capabilities use across the healthcare process, in all jurisdictions, and in lots of different context. This document will guide you through the installation, setup and testing of the FHIR Server. With the FHIR Server, the provider’s healthcare data can be made easily accessible with the standardized RESTful API and JSON formatted resources.\n\nIn addition, to limit the scope of this setup, the topics of security are not discussed.\n\nQuick Links\nThe reference technology stack links:\n\n  Official FHIR Website: http://hl7.org/fhir\n  Hapi FHIR - The Open Source FHIR API for Java: http://hapifhir.io/\n  SMART on FHIR - The Open App Connection Standard to FHIR Server: http://docs.smarthealthit.org/\n\n\nThe suite of tools are used and their links:\n\n  STU3 Sandbox Data: http://docs.smarthealthit.org/data/stu3-sandbox-data.html\n  Sample Patient Generator for STU3: https://github.com/smart-on-fhir/sample-patients-stu3\n  Add Tags to FHIR Bundle and Resources Uploader: https://github.com/smart-on-fhir/tag-uploader\n  SMART on FHIR Patient Browser: https://github.com/smart-on-fhir/patient-browser\n\n\nSteps Overview\nThe following steps will guide you through the server installation, data generation and UI testing. The instructions are proven to work on a Mac (OS X El Capitan).\n\nCompile Hapi FHIR Packages\nGo download Hapi FHIR Server http://hapifhir.io/download.html. At the time of this writing, the latest DSTU3 (Draft Standard for Trial Use 3) is stable.\nWe shall build our server and data upon this latest released standard.\n\nHAPI is built primary using Apache Maven. Even if you are using an IDE, you should start by performing a command line build before trying to get everything working in an IDE.\n\nExecute the build with the following command:\n\nmvn install\n\n\nNote that this complete build takes a long time because of all of the unit tests being executed. At the end you should expect to see a screen resembling:\n\n\nSetup Hapi FHIR JPA Server\nAfter the Hapi FHIR Server has been compiled, we can setup the JPA Server example located in hapi-fhir-jpasserver-example.\n\nWe shall edit the root URL context to a shorter name hapi-fhir (i.e. make the URL shorter).\n\nvi pom.xml\n\n\nIn the section of Jetty server plug-in config, we shall change the contextPath XML value to /hapi-fhir.\n\n&lt;!-- The following is not required for the application to build, but allows you to test it by issuing \"mvn jetty:run\" from the command line. 
--&gt;\n&lt;pluginManagement&gt;\n    &lt;plugins&gt;\n        &lt;plugin&gt;\n            &lt;groupId&gt;org.eclipse.jetty&lt;/groupId&gt;\n            &lt;artifactId&gt;jetty-maven-plugin&lt;/artifactId&gt;\n            &lt;configuration&gt;\n                &lt;webApp&gt;\n                    &lt;contextPath&gt;/hapi-fhir&lt;/contextPath&gt;\n                    &lt;allowDuplicateFragmentNames&gt;true&lt;/allowDuplicateFragmentNames&gt;\n                &lt;/webApp&gt;\n            &lt;/configuration&gt;\n        &lt;/plugin&gt;\n    &lt;/plugins&gt;\n&lt;/pluginManagement&gt;\n\n\nStart Hapi FHIR JPA Server\nAfter editing the pom.xml, we are ready to start the Hapi FHIR Server by the embedded Jetty webserver,\n\nmvn jetty:run\n\n\nThe Hapi FHIR Server example UI can be found at:\n\nhttp://localhost:8080/hapi-fhir\n\n\n\n\nThe Hapi FHIR’s RESTful API can be access through this root URL:\n\nhttp://localhost:8080/hapi-fhir/baseDstu3\n\n\nWe can use the Hapi FHIR Server UI for testing, for example, to retrieve a specific patient ID smart-1032702 as illustrated,\n\n\n\nBut wait, we need to populate with some sample data first. The following section will generate some fake (but realistic) data set.\n\nGenerate Sample Data Set\nRunning a server without health records is useless. In order to fiil the server with some realistically fake data, we shall use the Sample Patient Generator for STU3 https://github.com/smart-on-fhir/sample-patients-stu3\n\nThe primary purpose of this tool is to generate FHIR STU3 transaction bundles as JSON files. Once generated these bundles can be inserted into any compatible FHIR server using it’s API.\n\nThis generato..."
  },
  
  {
    "title": "SingularityNET AI Service Integration",
    "url": "/singularitynet-ai-service-integration",
    "date": "Feb 18, 2018",
    "categories": ["post"],
    "tags": ["AI","Blockchain","SingularityNET","Machine Learning"],
    "excerpt": "\nWith the advent of AI and Blockchain technology and its exponential impact on business, a recently released open-source project SingularityNET https://singularitynet.io/ is truly revolutionary by ...",
    "content": "\nWith the advent of AI and Blockchain technology and its exponential impact on business, a recently released open-source project SingularityNET https://singularitynet.io/ is truly revolutionary by combining both technologies into a decentralized market of coordinated AI services being backed by Blockchain’s smart contracts. Within the SingularityNET platform, the benefits of AI become a global commons infrastructure for the benefit of all; anyone can access AI tech or become a stakeholder in its development. Anyone can add an AI/machine learning service to SingularityNET for use by the network, and receive network payment tokens in exchange.\n\n\nToday, the only technical information is available through their white paper (Dec 19, 2017) at https://public.singularitynet.io/whitepaper.pdf that gives a glimpse into their technical details. After struggling for a few days, I discovered an approach to consume their technical information by hacking with the SingularityNET source code. This article summarizes how to implement an AI service provider integration with SingularityNET’s service wrapper API. The example AI service is using MNIST image classification implemented in Tensorflow.\n\n\n\nFigure. SingularityNET high-level system architecture, which illustrates how the platform supports AI agent to agent interactions and uses Blockchain smart contract to record the transactions.\n\nThis article will go through the steps to experiment with SingularityNET’s AI agent integration: creating a virtual environment, checking out the source code, running the MNIST Tensorflow agent example, explaining service adapter development, and showing service integration by configuration.\n\nObviously, there are many more SingularityNET topics, which must be explored by other articles.\n\n\n  SingularityNET is moving forward with another alpha release. This article is out-dated as of 2018-05-03; however the information can still be useful to understand the functional and implementational views of cooperating AI agents. For the latest SingularityNET development, reader can refer to Wiki at &lt;https://github.com/singnet/wiki/wiki&gt;.\n\n\nCreate a virtual environment\nVirtual environments make it easy to separate different projects and avoid problems with different dependencies and version requirements across components. 
In the terminal client enter the following, where envname is the name you want to give your environment; replace x.x with the Python version you wish to use.\n\nconda create -n envname python=x.x anaconda\n\n\nFrom this point forward, we shall use singnet as our environment name.\n\nconda create -n singnet python=3.6 anaconda\n\n\nTo activate or switch into your virtual environment, simply type the following, where singnet is the name you gave your environment at creation.\n\nsource activate singnet\n\n\nSingularityNET Source Code\nThe bleeding-edge SingularityNET code can be found at https://github.com/singnet/singnet\n\nCheck out the code into your local file system,\n\ngit clone https://github.com/singnet/singnet.git\n\n\nAfter checking out, this article assumes the code is located in the singnet directory.\n\nInstall SingularityNET Agent Requirements\n\ncd singnet/agent\npip install -r requirements.txt\n\n\nThere is a long list of installations.\n\n(singnet) [bcheung@Benny-Cheung:agent] pip install -r requirements.txt\nCollecting aiohttp (from -r requirements.txt (line 1))\n  Downloading aiohttp-3.0.1-cp36-cp36m-macosx_10_11_x86_64.whl (371kB)\n    100% |████████████████████████████████| 378kB 1.0MB/s\n...\nSuccessfully installed Jinja2-2.10 MarkupSafe-1.0 PyYAML-3.12 Pygments-2.2.0 aiohttp-3.0.1 aiohttp-cors-0.6.0 aiohttp-jinja2-0.16.0 alabaster-0.7.10 argh-0.26.2 async-timeout-2.0.0 attrs-17.4.0 babel-2.5.3 bleach-1.5.0 bson-0.5.2 cchardet-2.1.1 chardet-3.0.4 commonmark-0.5.4 coverage-4.5.1 coveralls-1.2.0 cytoolz-0.9.0 docopt-0.6.2 docutils-0.14 eth-abi-0.5.0 eth-keys-0.1.0b4 eth-tester-0.1.0b11 eth-utils-0.8.0 feedparser-5.2.1 fire-0.1.2 funcsigs-1.0.2 future-0.16.0 html5lib-0.9999999 hvac-0.4.0 idna-2.6 idna-ssl-1.0.0 imagesize-1.0.0 isodate-0.6.0 jsonrpcclient-2.5.2 jsonrpcserver-3.5.3 jsonschema-2.6.0 livereload-2.5.1 lru-dict-1.1.6 markdown-2.6.11 mock-2.0.0 multidict-4.1.0 numpy-1.14.0 packaging-16.8 pathtools-0.1.2 pbr-3.1.1 pluggy-0.6.0 port-for-0.3.1 protobuf-3.5.1 py-1.5.2 pyaml-17.12.1 pyparsing-2.2.0 pysha3-1.0.2 pytest-3.4.0 pytest-cov-2.5.1 pytz-2018.3 rdflib-4.2.2 recommonmark-0.4.0 requests-2.18.4 rlp-0.6.0 semantic-version-2.6.0 six-1.11.0 snowballstemmer-1.2.1 sphinx-1.7.0 sphinx-autobuild-0.7.1 sphinx-rtd-theme-0.2.4 sphinxcontrib-websupport-1.0.1 tensorflow-1.3.0 tensorflow-tensorboard-0.1.8 toolz-0.9.0 tornado-4.5.3 urllib3-1.22 uvloop-0.9.1 watchdog-0.8.3 web3-3.16.5 werkzeug-0.14.1 yarl-1.1.1\n\n\nLife seems good: every requirement is installed without a hiccup, which is a rare event in a hacking experience.\n\nDocker Prerequisites\nAccording to the SingularityNET official website, SingularityNET runs on Mac OS X, or any Linux which has Python 3 inst..."
  },
  
  {
    "title": "Ethereum Mining on Windows 10",
    "url": "/ethereum-mining-on-windows-10",
    "date": "Sep 17, 2017",
    "categories": ["post"],
    "tags": ["Mining","Ethereum","Crypto Currency","GPU"],
    "excerpt": "\nThe value of dedicated GPU is going beyond the needs of gaming, it is proven to fulfill the professional needs for Deep Learning researches.\nAs it turns out the modern graphics cards are very good...",
    "content": "\nThe value of dedicated GPU is going beyond the needs of gaming, it is proven to fulfill the professional needs for Deep Learning researches.\nAs it turns out the modern graphics cards are very good at achieving the framerate requirements for virtual-reality. To our biggest surprise, GPU is profitable at mining crypto-currency again, such as Ethereum, so we can profit from our current hardware setup. This article will run through the basic mining knowledge, and guides how to setup a Windows 10’s machine with a GPU to do Ethereum mining.\n\n\nBitcoin Mining History\nA few years ago, after reading the paper by pseudonym author “Satoshi Nakamoto” on Bitcoin: A Peer-to-Peer Electronic Cash System describing a distributed trust solution to the open network, the sense of computational beauty about it’s concepts and solutions propelled my interest to try out Bitcoin mining. The value of Bitcoin is approximately $120 USD at 2013, which is impressive for it’s growth from zero value within a short few years of Bitcoin invention.\n\nTo reduce the risk of the adventure, like everyone at that time, we started out by utilizing the gaming GPU to reach a profitable hashrate (to be explained later).\nThe profitability is measured by how much coin value that you generated against how much electricity that you paid. If the coin value is higher then the cost of electricity, you are in a profitable mining business, excluding the initial equipment cost.\nThe original Bitcoin hashing algorithm (SHA256), performing a complex but fixed sequence of operation can be replaced by ASIC (Application Specific Integrated Circuit) dedicated hardware. The computation speed and power efficiency of ASIC essentially killed the profitability of using GPU based mining quickly. In order to stay in the Bitcoin mining game, Adafruit’s instruction to build a Raspberry Pi Bitcoin Miner, which can hash at 2 GH/s with extremely low power consumption (see the following picture), will keep mining alive for a little longer.\n\n\n\nFigure. Raspberry Pi Mining Rig running at 2 GH/s, the Raspberry Pi has attached to Adafruit 16x2 LCD + Keypad Kit to display mining statistics. The row of ASICMiner Block Erupter USB sticks, each is capable of 300 MH/s. The power consumption for each Block Erupter will be 5W, making the miner very profitable at 2013. But it is useless at 2017.\n\nFor the history of Bitcoin mining, the rise of ASIC dedicated hardware basically destroyed all GPU-based mining profitability by the exponential grow of the Bitcoin Network Difficulty.\n\nEthereum Mining\nUntil recently Ethereum, which is believed to be Bitcoin 2.0, specifically designed it’s Ethash as Memory Hardness Algorithm to make ASIC hardware based solution impractical; subsequently, Ethash has made GPU-based mining profitable again. Also, adding to the miner incentive, the Ethereum trading price has enjoyed some significant growth. Ethereum not only shows it’s profitability but also the Ethereum trading liquidity as a popular cryptocurrency for the future.\n\nAs a Ethereum Miner and Researcher, there are few essential concepts must be clarified for the purpose of mining. If you don’t care about the beauty of computing, you can skip ahead to the section of “Road to Mining”.\n\nBlockchain for Distributed Trust\nThe key cryptocurrency technology is Blockchain. The Blockchain “locked-in” all the transactional facts into space and time, such that modification will be almost impossible in the open and distributed network. 
The “space” is the content and the “time” is the sequence of contents. Imagine a Block’s content is the facts about money transfers; the Block’s hash value then provides a unique signature (or digest) to tell whether the Block’s content is authentic. If the Block’s content has been tampered with in any way, the hash value will be different, so everyone knows that the content cannot be trusted. Since the Blockchain has been distributed throughout the open network, the other untampered copies of the Blockchain can continue to maintain the truth. The tampered Blockchain will simply be ignored.\n\n\n\nFigure. This picture explains Blockchain security by showing that the current Block holds the hash value of the previous Block. If the previous Block’s content has been modified, the hash value will be different.\n\nIn addition, it is impossible to reverse-engineer the content to produce the same hash value, that is, to modify the content in just the right way to fake the hash. This is the beauty of a Cryptographic Hash Function: even if you change a single character, the hash value changes so significantly that you have no way to control the changes.\n\nMining as Proof of Work for Values\nFor the Blockchain of a cryptocurrency, e.g. Bitcoin or Ethereum, the Block’s content will be the transfers of coins. Given the following Blocks of coin transactions, we can always trace back the origin of a coin. For example, a coin sent by \\(A\\) can be traced back in time through the Blockchain connectivity, where the total..."
  },
  
  {
    "title": "Deep Transfer Learning on Small Dataset",
    "url": "/deep-transfer-learning-on-small-dataset",
    "date": "Mar 14, 2017",
    "categories": ["post"],
    "tags": ["Deep Learning","Neural Network","Transfer Learning","Keras"],
    "excerpt": "\nThe success of Convolutional Neural Network (ConvNet) application on image classification relies on two factors (1) having a lot of data (2) having a lot of computing power; where (1) having data ...",
    "content": "\nThe success of Convolutional Neural Network (ConvNet) application on image classification relies on two factors (1) having a lot of data (2) having a lot of computing power; where (1) having data seems to be a harder issue. Data acquisition is generally the major costs of any realistic project. But if we only can afford a small dataset, can we still use ConvNet effectively?\n\n\nThe Transfer Learning technique is using an existing ConvNet feature extraction and the associated trained network weights, transferring to be used in a different domain. The researches indicated the ConvNet exploits the hierarchical distributed representations. The lower layers of a ConvNet contain more generic features (e.g. edge detectors or color blob detectors) that should be useful to many tasks, but higher layers of the ConvNet becomes more specific to the details of the domain classes contained in the original dataset.\n\n\n\n  [2024/10/07] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/deep-transfer-learning-on-small-dataset/Podcast_Deep_Transfer_Learning.mp3\n\n\n\n\n\n\nThe Keras Blog on “Building powerful image classification models using very little data” by Francois Chollet is an inspirational article of how to overcome the small dataset problem, with transfer learning onto an existing ConvNet. Since modern ConvNets take 2-3 weeks to train across multiple GPUs on ImageNet (which contains 1.2 million images with 1000 categories), it is common to see people release their final ConvNet trained weight for the benefit of others who can use the networks for fine-tuning. For example, the Caffe library has a Model Zoo where people share their network weights. However, the higher layers weights may need to be fine-tuned. In the case of network using ImageNet dataset which contains many dogs and cats, a significant portion of the network weights may be devoted to features that are specific to differentiate between breeds of dogs in the higher layers, which are not as useful to a different domain. The transfer learning strategy must take into consideration.\n\n\n\nFigure. VGG16 ConvNet Fine-Tuning Technique for adapting to different domain\n\nA good transfer learning strategy is outlined as following steps:\n\n  Freezing the lower ConvNet blocks (blue) as fixed feature extractor. Take a ConvNet pretrained on ImageNet, remove the last fully-connected layers, then treat the rest of the ConvNet as a fixed feature extractor for the new dataset. In an VGG16 network, this would compute a 4096-D vector for every image that contains the activations of the hidden layer immediately before the classifier. These features are termed as CNN codes.\n  Training the new fully-connected layers (green, aka. bottleneck layers). Extract the CNN codes for all images, train a linear classifier (e.g. Linear SVM or Softmax classifier) for the new dataset.\n  Fine-tuning the ConvNet. Replace and retrain the classifier on top of the ConvNet on the new dataset, but to also fine-tune the weights of the pretrained network by continuing the back-propagation to part of the higher layers (yellow+green).\n\n\nTransfer Learning Experiments\nThis section reports 3 experiments that applying the previous outlined transfer learning strategy. 
For each experiment, there is a specific question that we want to answer.\n\n  Kaggle’s Dogs vs Cats - to learn how to use the techniques.\n  Oxford’s 102 Category Flower - to answer whether a large number of categories can be adapted.\n  UEC’s Food 100 - to answer whether food/snack watching can be adapted.\n\n\nExperiment Setup\nThe previous blog posts on Deep Style Transfer and Deep Dream have already shown how to set up on Windows 10.\nThese experiments are set up using an NVidia GTX 1070 GPU with CUDA 8.0 running on Windows 10.\nAll software is written in Python using Keras configured with the Theano backend.\n\n\n  Due to an incompatibility between the latest CUDA 8.0 and the latest Keras version, the following step helps to downgrade Keras to 1.2.0 if needed.\n\n\nDowngrade to Keras 1.2.0\nIn order to follow the “image classification” blog’s code here,\nsimply downgrading to Keras 1.2.0 solved the running issue.\n\nFind the Keras version,\npython -c \"import keras; print keras.__version__\"\n\n\nIf the reported Keras version is greater than 1.2.0, follow these steps to downgrade.\n\npip uninstall keras\npip install keras==1.2.0\n\n\n\nExperiment 1: Dogs vs Cats Dataset\nFollowing the article\n“Building powerful image classification models using very little data”,\ntwo sets of pictures were downloaded from Kaggle: 1000 cats and 1000 dogs (extracted from the original dataset, which had 12,500 cats and 12,500 dogs; only the first 1000 images of each class are used). We also use 400 additional samples from each class as validation data to evaluate the models.\n\nHere are some sample images of the “Dog vs Cat” dataset. Some images are definitely challenging, for example, the animal is partial..."
  },
  
  {
    "title": "Deep Dream with Caffe on Windows 10",
    "url": "/deep-dream-on-windows-10",
    "date": "Mar 03, 2017",
    "categories": ["post"],
    "tags": ["Deep Dream","Neural Network","Python","GPU"],
    "excerpt": "\nDeep Dream is an algorithm that makes an pattern detection algorithm over-interpret patterns. The Deep Dream algorithm is a modified neural network. Instead of identifying objects in an input imag...",
    "content": "\nDeep Dream is an algorithm that makes an pattern detection algorithm over-interpret patterns. The Deep Dream algorithm is a modified neural network. Instead of identifying objects in an input image, it changes the image into the direction of its training data set, which produces impressive surrealistic, dream-like images.\n\n(read the original Google blog https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html)\n\nThe result are beautiful hallucinations like the one below. The neural network amplified the perceived objects that it is being trained to recognized. I wondered if this is the same effects on our visual cortex subdued by some drugs?\n\n\n\nFigure. Vangogh’s “Starry Night” Deep Dream transformation\n\nThis article is a continuation of my previous blog on Deep Learning with GPU on Windows 10. You may want to read how to setup NVidia CUDA 8.0 to utilize your GPU for speeding up Deep Dream calcuations.\n\nCaffe\nhttp://caffe.berkeleyvision.org\n\nCaffe is perhaps the first mainstream industry-grade deep learning toolkit, started in late 2013, due to its excellent convolutional neural network implementation (at the time). It is still the most popular toolkit within the computer vision community, with many extensions being actively added. Especially, there are many popular pre-trained neural network (aka. Caffe Model Zoo) can be easily downloaded and used.\n\nThe Deep Dream script is using Google’s award winning entry of ILSVRC 2014 GoogLeNet, a 22 layers deep network trained to regconize images. (explained in http://joelouismarino.github.io/blog_posts/blog_googlenet_keras.html)\nGoogLeNet achieved the classification of the ImageNet dataset (all sorts of animals, household objects, vehicles, etc.), 93.33% of the time the correct object class will be contained in the GoogLeNet ensemble’s top five predictions.\n\n\n\nFigure. Showing the GoogLeNet CNN 22 layers deep network\n\nThe Caffe’s pre-trained model that we downloaded and used is the iteration 2,400,000 snapshot (60 epochs) using quick_solver.prototxt.\n\n\n  \n    \n      name\n      caffemodel\n      caffemodel_url\n    \n  \n  \n    \n      BVLC GoogleNet Model\n      bvlc_googlenet.caffemodel\n      http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel\n    \n  \n\n\nFurther Caffe info can be found:\n\n  Caffe Tutorial Documentation\n  BVLC Reference Models and the Community Model Zoo\n\n\nPrepare Python Virtual Environment\nhttps://www.continuum.io/downloads\n\nNote: this article assumes you are using bash shell on Windows.\nThe recommended bash shell comes from Git for Windows https://git-for-windows.github.io installation.\n\nYou can install either Anaconda Python 2.7 or Anaconda Python 3.5 on Windows 10. Later, we would create a virtual environment to isolate our Deep Dream tools installation. We shall refer to your Anaconda installation location as {path_to_anaconda_location} later.\n\nDefine Anaconda Virtual Environment\nBy experience, hacking on a new tools suite are usually messy and full of conflicts. Using an isolated Python virtual environment will protect you from headaches and disaster of installations.\nIn bash shell, enter the following where caffe (or your choice of name) is the name of the virtual environment, and python=2.7 is the Python version you wish to use.\n\nconda create -n caffe python=2.7 anaconda\n\n\nPress y to proceed. 
This will install the Python version and all the associated anaconda packaged libraries at {path_to_anaconda_location}/envs/caffe\n\nInstall Support Packages on Virtual Environment\nOnce the caffe virtual environment has been installed, activate the virtualenv by\n\nsource activate caffe\n\n\nContinue to install all the shell script commands,\n\nconda install boost\nconda install mingw libpython\n\n\nThen install Caffe’s dependencies\n\nconda install --yes numpy scipy matplotlib scikit-image pip six\n\n\nalso you will need a Google’s protobuf python package that is compatible with pre-built dependencies. This package can be installed this way:\n\nconda install --yes --channel willyd protobuf==3.1.0\n\n\nInstall Caffe on Windows 10\nThe lazy way to install Caffe on Windows 10 is downloading the prebuilt binaries from Caffe’s Windows branch on Github:\nhttps://github.com/BVLC/caffe/tree/windows\n\nFor my Windows 10 setup, I can choose either of these,\n\n  Visual Studio 2015, CUDA 8.0, Python 3.5: Caffe Release (64 bits)\n  Visual Studio 2015, CUDA 8.0, Python 2.7: Caffe Release (64 bits)\n\n\nSince we have created the Anaconda Python 2.7 virtual environment to host our experiment, we choose to install Visual Studio 2015, CUDA 8.0, Python 2.7: Caffe Release package.\n\nAfter completing the install, ensure to add the following into your Windows’s environment variable, {path_to_caffe} refers to Caffe’s installation.\n\nPYTHONPATH={path_to_caffe}\\caffe\\python\n\n\nDeep Dream Python Script\n\nInstall Deep Dream Script\nYou can clone the Deep Dream script from GitHub repository\nhttps://github.com/bennycheung/PyDeepDream\n\ngit clone https://github.com/bennycheung/PyDeepDream.gi..."
  },
  
  {
    "title": "Deep Learning with GPU on Windows 10",
    "url": "/deep-learning-on-windows-10",
    "date": "Feb 24, 2017",
    "categories": ["post"],
    "tags": ["Deep Learning","Neural Network","Python","GPU"],
    "excerpt": "\nYou just got your latest NVidia GPU on your Windows 10 machine.\nOther than playing the latest games with ultra-high settings to enjoy your new investment,\nwe should pause to realize that we are ac...",
    "content": "\nYou just got your latest NVidia GPU on your Windows 10 machine.\nOther than playing the latest games with ultra-high settings to enjoy your new investment,\nwe should pause to realize that we are actually having a supercomputer able to do some serious computation.\nA Deep Learning algorithm is one of the hungry beast which can eat up those GPU computing power.\n\n\nUnfortunately, the Deep Learning tools are usually friendly to Unix-like environment.\nWhen you are trying to start consolidating your tools chain on Windows, you will encounter many difficulties.\nI spent days to settle with a Deep Learning tools chain that can run successfully on Windows 10.\nHere is the summary of my selection and installation procedure. If you have the endurance to complete, towards the end of this article, you can run neural style transfer to create “deep” and impressive image\n(The original paper “A Neural Algorithm of Artistic Style” can be found at https://arxiv.org/abs/1508.06576)\n\n\n\nFigure. Grid of sample results after running neural style transfer algorithm on a self-portrait\n\nJust a quick overview, the pre-requisite dependencies that we shall install,\n\n  Scipy + PIL (install via Anaconda Python)\n  Numpy (install via Anaconda Python)\n  CUDA (GPU) (install via NVidia package)\n  CUDNN (GPU) (optional, install via NVidia package)\n  PyCUDA (install via prebuilt binaries)\n  Theano (install via pip)\n  Keras (install via pip)\n\n\nInstall Theano under Anaconda Python (Windows 10)\nhttp://deeplearning.net/software/theano/\n\nTheano is one of the popular Deep Learning framework,\nwhich has a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.\n\nNote: this article assumes you are using bash shell on Windows.\nThe recommended bash shell comes from Git for Windows https://git-for-windows.github.io installation.\nYou will need git to checkout interesting projects, source codes in your “mind hacking” career.\nSo, this is the right time to hit yourselves with the right power tool.\n\nAnaconda Python\nhttps://www.continuum.io/downloads\n\nAnaconda is an easy-to-install free package manager, environment manager, Python distribution, and collection of over 720 open source packages offering free community support.\n\nYou can install either Anaconda Python 2.7 or Anaconda Python 3.5 on Windows 10. Later, we would create a virtual environment to isolate our Deep Learning tools installation. We shall refer to your Anaconda installation location as {path_to_anaconda_location} later.\n\nDefine Anaconda Virtual Environment\nBy experience, hacking on a new tools suite are usually messy and full of conflicts. Using an isolated Python virtual environment will protect you from headaches and disaster of installations.\nIn bash shell, enter the following where theano (or your choice of name) is the name of the virtual environment, and python=2.7 is the Python version you wish to use.\n\nconda create -n theano python=2.7 anaconda\n\n\nPress y to proceed. 
The conda create command installs the chosen Python version and all the associated Anaconda packages at {path_to_anaconda_location}/envs/theano\n\nInstall Theano on Virtual Environment\nOnce the theano virtual environment has been installed, activate the virtualenv by\n\nsource activate theano\n\n\nContinue by running the following install commands in the shell,\n\nconda install boost\nconda install mingw libpython\n\n\nThen, install Theano\n\npip install theano\n\n\nAdd the following lines to the $HOME/.theanorc file\n\n[global]\nfloatX = float32\ndevice = gpu\n\n[nvcc]\nflags=-L{path_to_anaconda_location}\\envs\\theano\\libs\ncompiler_bindir=C:\\Program Files (x86)\\Microsoft Visual Studio 12.0\\VC\\bin\n\n\nSince I have created the virtual environment for Theano, you can see that flags points to that virtual environment’s libs directory.\nI read that we have to use VS 12.0 in order to compile CUDA. The setting seems to work with my CUDA 8.0 and Visual Studio 2015 Community installation, and I did not bother to investigate whether it works for other Visual Studio versions.\n\nInstall PyCUDA\nCUDA is a parallel computing platform and programming model invented by NVidia.\nIt enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).\nYou will need to download and install the CUDA development toolkit from NVidia https://developer.nvidia.com/cuda-downloads.\n\nPyCUDA is the Python wrapper of the CUDA API that allows writing CUDA code in Python.\nIn our case, the painless way to install PyCUDA is to use the prebuilt Windows 10 binaries. Follow this link to download:\n\nhttp://www.lfd.uci.edu/~gohlke/pythonlibs/?cm_mc_uid=67672146725714876308558&amp;cm_mc_sid_50200000=1487630855#pycuda\n\nFor my version of CUDA 8.0, Python 2.7 for Windows 10 64 bits, the package will be:\npycuda‑2016.1.2+cuda8044‑cp27‑cp27m‑win_amd64.whl\n\nThen, use pip to install this package\n\npip install pycuda‑2016.1.2+cuda8044‑cp27‑cp27m‑win_amd64.whl\n\n\nTesting Install\nFire up Python with the interactive shell flag -i.\n\npyt..."
  },
  
  {
    "title": "Model of Spatial Construction",
    "url": "/model-of-spatial-construction",
    "date": "Jul 05, 2016",
    "categories": ["post"],
    "tags": ["Spatial Reasoning","Voronoi","Delaunay","Tessellation"],
    "excerpt": "\nThe heart of a spatial reasoning system is utilization of spatial knowledge.\nWith proper represented spatial knowledge, the task of spatial\nreasoning is made intuitive, flexible and practical.\nThi...",
    "content": "\nThe heart of a spatial reasoning system is utilization of spatial knowledge.\nWith proper represented spatial knowledge, the task of spatial\nreasoning is made intuitive, flexible and practical.\nThis article introducing the concept of Spatial Construction,\nwhich is the theory of deriving the spatial knowledge.\nThe strength of the theory is it’s intuitive foundation\nthat allows flexible and practical algorithmic construction\nof spatial knowledge. We shall concentrate on setting up\nthe foundation of spatial construction.\nThe other concern for spatial reasoning, which is how to utilize the spatial knowledge effectively and efficiently, we leave that until another article to elaborate.\n\n\nTowards the end, we shall introduce\nPyDelaunay (https://github.com/bennycheung/PyDelaunay) as\na practical spatial construction. It is an efficient\nPython implementation of Voronoi/Delaunay Tessellation,\nwhich can be served as the Basic Map\nof the neighbourhood concept.\n\n\n\n  [2024/10/04] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/model-of-spatial-construction/Podcast_Model_of_Spatial_Construction.mp3\n\n\n\n\n\n\nModel of Spatial Reasoning\nSpatial reasoning is described in terms of\na generic solution to many spatial problems.\nConsidering that spatial knowledge is represented as\na set of entities and a set of relations,\nspatial reasoning is a knowledge increasing process which is designed to\nuse spatial primitives to deduce\nmore and more specific relations between any two entities.\nFor example, in an environment \\(G_s\\),\nthe possible solution space may be described\nby all possible connectivities between \\(v_a\\) and \\(v_b\\).\n\nAlthough it is not a specific relation between \\(v_a\\) and \\(v_b\\),\nthe spatial reasoning process tends to produce more specific relations between\n\\(v_a\\) and \\(v_b\\).\nIn this case,\n“\\(v_a\\) is related to \\(v_b\\) via \\(v_c\\)”,\nis a new relation and it is more specific than the complete graph\nconnections.\n\nThe deduction of this specific relation may be a knowledge intensive process\nor pure algorithmic process. In each case, however,\nthis newly deduced relation is defined as a spatial reasoning step.\nAfter the deduction of this new spatial relation, it may be immediately recorded\ninside the knowledge base. The new piece of knowledge may enhance other\nsteps of spatial reasoning as long as the relation,\n“\\(v_a\\) is related to \\(v_b\\) via \\(v_c\\)”,\nstill holds in the knowledge base.\n\nSpatial reasoning can be further divided into two subprocesses:\nSpatial Construction and Spatial Utilization.\nSpatial construction is the foundation for all spatial reasoning\nactivities since it provides ways to represent space and spatial semantics.\nSpatial utilization is the actual process,\nusing spatial constructs and spatial primitives,\nto construct some very specific relations.\nSpatial construction can be viewed as a generic representation\nfor any application but the spatial utilization is more application specific.\nThe application may impose its domain knowledge\nwhich affects spatial utilization.\nAlthough this model separates these two subprocesses,\nthey should be viewed as complementary processes to each other\nas illustrated.\n\n\n\nAfter spatial construction, the base knowledge can be immediately\napplied in the utilization step. 
After the utilization step has deduced\nnew traversal information, it may reconstruct some of the base knowledge.\nTo start the construction process, an external procedure is needed.\nThis procedure may be a predefined knowledge base or some knowledge\nacquisition process.\n\nThe traditional framework of reasoning consists of two important parts,\nwhich are the knowledge base and the inference engine.\nThe definition of spatial reasoning as a process does not deviate\nfrom this fundamental construct but elaborates on the structure by\nincorporating specific processes.\nThese specific processes are driven by intensive spatial knowledge.\nTherefore, this model is not a new type of reasoning mechanism but\na new process control to serve as a supporting mechanism for the needs\nof spatial reasoning.\n\nSpatial Construction\nThe natural grouping of spatial entities is the basis of the\nspatial construction concept.\nThe original reasoning scheme, which separates knowledge base and\ninference engine, is still applicable.\nHowever, the newly proposed scheme suggests that the spatial knowledge base\nis indexed by its inherent spatial knowledge,\nas illustrated.\n\n\n\nThis basic indexed space is called the Basic Map.\nThis basic map not only captures the neighbourhood nature of space\nbut also provides a fast processing structure for geometric information.\n\nAfter a basic map is constructed,\na variety of other maps (including topology, structure, etc.)\ncan also be constructed from\nthe multi-faceted and hierarchical nature of space.\nMoreover, based on the requirements of applications,\nspecialized maps can also be derived.\nThe notion of ..."
  },
  
  {
    "title": "Spatial Reasoning Explained",
    "url": "/spatial-reasoning-explained",
    "date": "Jun 08, 2016",
    "categories": ["post"],
    "tags": ["Spatial Reasoning","AI","Prolog","Logic"],
    "excerpt": "\nSpatial Reasoning is a logical reasoning system that assumed entities located in space and have a spatial structure.\nMaking machines that can perceive and understand space has always been\na resear...",
    "content": "\nSpatial Reasoning is a logical reasoning system that assumed entities located in space and have a spatial structure.\nMaking machines that can perceive and understand space has always been\na researcher’s dream. Our lives could be enhanced by their assistance;\nwith the spatial intelligence of machines, we would have new methods of\nplanning and navigation through spatial orientation.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/spatial-reasoning-explained/Podcast_Spatial_Reasoning_Explained.mp3\n\n\n\n\n\n\nTopics of Spatial Reasoning\nComputer science involves a list of interesting topics that required spatial reasoning ability:\n\n\n  \n    Artificial Intelligence -\nOur everyday concepts are populated with qualitative spatial description. Something is sitting next to something else.\nThe description is usually fuzzy and imprecise, yet\nwe can deduce some non-trivial conclusion.\nFor example, a train is going west. If the station exit is closer\nto the south, where would we sit on the train in order to exit faster?\nThis involved the spatial structure of the train and qualitative understanding of the direction that train is travelling.\n  \n  \n    Spatial Database -\nIn addition to well-known numeric or character objects, the “spatial object “ is a popular extension to the collection of the “database objects”.\nWhen the data has a spatial attribute, for example a street address or a phone number,\nwe could use the spatial proximity to calculate and deduce more info.\nWe are designating a finite geometric representation and operation to these spatial entities.\n  \n  \n    Image Recognition -\nWhen an object appears in the visual field, it’s of great interest to image recognition application.\nThe object usually occurred in a spatial context, for example, the snacks\nsitting on top of a table. The snacks are described geometrically and spatially to precisely pin-point their location and spatial occupation.\n  \n  \n    Linguistic Processing -\nIn communication, information is usually accompany with a spatial context.\nFor example, battlefield logistics rely on spatial optimization to locate\nthe most strategically advantageous position. The method is described both verbally and visually on a map.\nAs another example, when a person is lost,\nthey can usually describe their spatial context in a fragmentary form.\nAfter listening to the description, we are able to reconstruct the context to spatially pinpoint where the lose person is located.\n  \n\n\nSpatial Logic\nWhen we are looking at a logical system, we can evaluate formula \\(A \\land B\\) by evaluating\nthe truth or falsity of the entities \\(A\\) and \\(B\\) separately. The \\(\\land\\) acting as\nan logical AND operator defined within the logical system, specified both\n\\(A\\) and \\(B\\) must be true; then \\(A \\land B\\) will evaluate to true. If \\(A\\) and \\(B\\) are both\nspatial entities, we are extending the formula with a spatial structure determining if\n\\(A \\land B\\) in \\(S\\) is possible. If it is possible, evaluate \\(A \\land B\\) can be true within the space \\(S\\).\n\nAssume a set of spatial entities \\(A=\\{A_1, A_2, ..., A_n\\}\\)\nwe can assert a spatial structure\n\\(S\\) will be true for set \\(A\\). 
We call this set \\(A\\) the possible spatial configuration of \\(S\\).\nTo put this definition into a concrete example: if we need to navigate from a location X to location Y on land,\nwe would assume there is a road network structure that\nconnects the two locations.\nWe can see that the set \\(A=\\{X, ..., Y\\}\\) and\nthe road network connecting all elements of \\(A\\) will be \\(S\\).\n\nSometimes, the spatial context does not need to involve a spatial entity\noccupying a location. For example, the most familiar 2D Cartesian coordinate system is a spatial reference system.\nWith a chosen origin, e.g. in 2D (0,0), all possibilities of location are laid out explicitly. If there is a spatial entity\npositioned in this 2D Cartesian coordinate system, we refer to the entity’s\nlocation by measuring from the chosen origin. We shall come back to the\ndifference between implicit spatial location (topological relations)\nand explicit spatial location (geometrical relations) in a future discussion.\nWe shall assume we are referring to implicit spatial location (topological relations) in this article.\n\nPuzzles that require Spatial Logic\n“What does a system look like when it has spatial reasoning ability?”\nThis is best answered by an example.\nBy writing a Prolog program, we will illustrate the possibility of\nsolving spatial logic puzzles automatically.\nThe following is a typical spatial logic puzzle that is often found in puzzle books.\n\nSolving a Spatial Logic Puzzle\nFive married couples live in houses shown on the map below.\n\n\n\n\n  Claire lives further east than Walter, who isn’t married to Mary.\n  Walter lives further west than Debbie, who lives further south than Bill.\n  Sandra lives further north than Katie, who l..."
  },
  
  {
    "title": "Recognizing Snacks using SimpleCV",
    "url": "/recognizing-snacks-using-simplecv",
    "date": "May 05, 2016",
    "categories": ["post"],
    "tags": ["SnackWatcher","SnackClassifier","Python","SimpleCV","Computer Vision","Machine Learning"],
    "excerpt": "\nThis article aims to provide the basic knowledge of how to recognize snacks by\nusing Python and SimpleCV. Readers will gain practical programming knowledge via\nexperimentation with the Python scri...",
    "content": "\nThis article aims to provide the basic knowledge of how to recognize snacks by\nusing Python and SimpleCV. Readers will gain practical programming knowledge via\nexperimentation with the Python scripts included in\nthe Snack Classifier open source project.\n\nTo illustrate with a snacks recognition app, the Snack Watcher\nwatches any snacks present on the snack table.\nFor Snack Watcher to determine if there was an interesting event,\nit needs to process the image into a set of image “Blobs”. For each “Blob”, Snack Watcher\ncompares the “Blob” with it’s previous state to determine if the “Blob” was added, removed or stationary.\n\n\nMore interestingly, the Snack Watcher can be configured to recognize\nif that “Blob” looks like a particular kind of snack (e.g. cookie or candy).\nIn the following Snack Watcher captured image, we can see the snacks have been classified\nand the results are labelled with their kind, reporting the “NEW SNACKS DETECTED!” event\nonto a message channel.\n\n\n\nThe Magics of Vision and Classification\nThe Python computer vision’s framework, SimpleCV is an OpenCV wrapper,\nthat makes processing, detecting, and displaying image simple, so that\nwe don’t need to invest great effort to achieve interesting results.\nFor machine learning and data classification, SimpleCV is built on top of the Orange framework.\nOrange implements a rich set of image extraction operators and\nmachine learning algorithms to support our goal - the snack classification.\nFor setting up OpenCV, SimpleCV and Orange for a Raspberry Pi project, please refer to this blog post\nRaspberry Pi 3 for Computer Vision\nfor instructions.\n\nThe best way to learn about snack classification is trying it out.\n\nThe Snack Classifier is an open source project\nfor snack classification. It comes with two Python scripts to help experiment with\nthe system. Once the concepts are affirmed after experimenting with the scripts,\nthe Snack Classifier can be used as a Microservice to support your snack classification tasks;\nextending the overall goal of effective snack watching!\n\nPython Scripts for Supervised Machine Learning\nThe Python scripts described in here can be used as a Supervised Machine Learning tools.\nThe typical supervised machine learning will go through two stages (see picture below).\nDuring the first stage (a) Training,\nthe computer is presented with example inputs and their desired outputs label (aka. class, such as cookies),\nrunning through a set of feature extractors to summarize the input as a vector of features.\nWith the support of the SimpleCV machine learning algorithms,\nthe system generates a classifier model which can\neffectively classify the training inputs.\n\nThe second stage (b) Prediction, can take a unknown input sample,\nrunning through the same set of feature extractors to summarize the unknown as a\nvector of features. With the classifier model obtained from training, the unknown\ninput can be classified as the result label (e.g. 
\n\n\n\n(a) Snack Training (Python Script)\nThis script has been inspired by this article,\nA fruit image classifier with Python and SimpleCV.\nWe have enhanced the article’s code snippet to make easy-to-use\nSnack Training and Classifier scripts.\n\nsnack-trainer.py\n\nUsage: snack-trainer.py [options] -c &lt;classes&gt; -m &lt;method&gt;\n\nOptions:\n  -h, --help            show this help message and exit\n  -g, --debug           debugging mode\n  -c CLASSES, --class=CLASSES\n                        detect classes, comma seperated\n  -a TRAIN_PATH, --train=TRAIN_PATH\n                        training samples path\n  -t TEST_PATH, --test=TEST_PATH\n                        testing samples path\n  -r RESULT_PATH, --result=RESULT_PATH\n                        testing results path\n  -s CLASSIFIER, --classifier=CLASSIFIER\n                        using classifier (svm|tree|bayes|knn)\n  -f FEATURE_PATH, --feature=FEATURE_PATH\n                        save training features into file\n  -e CLASSIFIER_FILE, --save=CLASSIFIER_FILE\n                        save classifier into file\n\n\nThe default path containing all the training images is train,\nthe default path containing all the testing images is test and\nthe default path where the results are written is result. Of course, you\ncan change the directories with the corresponding commandline options.\n\nAll image classes must be laid out in separate directories.\nThe class labels must match the directory names.\nFor example, if we have 2 classes cookie and other, the directory layout should look like,\n\n  train\n  |____cookie\n  | |____cookie_001.png\n  | |____cookie_002.png\n  | |____ ...\n  |____other\n  | |____other_001.png\n  | |____other_002.png\n  | |____ ...\n\n  test\n  |____cookie\n  | |____test_001.png\n  | |____test_002.png\n  | |____ ...\n  |____other\n  | |____test_003.png\n  | |____test_004.png\n  | |____ ...\n\n\nAfter we set up all the image directories, we can execute snack-trainer.py by,\n\npython snack-trainer.py -g -c \"cookie,other\" -s tree\n\n\nIn this example, we are classifying cookie an..."
  },
  
  {
    "title": "Snack Watching with Raspberry Pi",
    "url": "/snack-watching-with-raspberry-pi",
    "date": "Apr 28, 2016",
    "categories": ["post"],
    "tags": ["SnackWatcher","RaspberryPi","Python"],
    "excerpt": "\nStarting as a fun Jonah Group project,\nthe Snack Watcher is designed to watch the company’s “Snack Table”. If there are\nsome new “Snacks” presented on the “Snack Table”, it can be used to report t...",
    "content": "\nStarting as a fun Jonah Group project,\nthe Snack Watcher is designed to watch the company’s “Snack Table”. If there are\nsome new “Snacks” presented on the “Snack Table”, it can be used to report the\nevent onto chat channels, emails or messages saying “Snack Happened!”, posting\nan image and trying to classify the snacks that it observed. It supports both as\nweb site for interactive snack viewing and RESTful API for programmatic snack querying.\n\n\nSnack Watcher Github Repo\n\n\n  Webcam connected to watch at the “Snack Table”\n\n\n\n\n\n  snack-web captured image sample with blob status (green means New, red means removed) and blob classification (they looks like “package” from classifier training)\n\n\n\n\nsnack-web\nsnack-web is a web application showing the result of snack watching, which has\nbeen designed to configure and run on a Raspberry Pi 2 or 3. snack-web can\nbe driven, either manually (via Web) or programmatically (via RESTful API) to\ntake pictures and push the snapshots into the static/images directory. Alternatively, the\nRESTful API can be programmatically used to watch and to return the images. The API is\na key feature to integrate with a external system, providing utilities to\nreport the snacks status.\n\nsnack-web Front Page\nThe following illustrated the front page of snack-web, the front page menu items are listed:\n\n\n  Links: display the last N snack captured image and it’s processing stages\n  Calibrate: take a background image for calibrating the background colour\n  Snap: snap a snack image from the camera now\n  Teach: (Require advanced setup) Currently still under heavy development, the teaching module is designed to interactively classify snack for future training. This required classifier setup to work.\n\n\n\n\nFor each snack image capture, it collects the set of processed images for debugging. User can understand how the snacks are identified. For each blob that the system detected, it is stored for displaying and for training. The colour coded blob represent, green is the new blob, yellow is the stationary blob, and red is the removed blob. By click on each image bar, a larger image is shown for detail inspections.\n\n\n\nRESTful API\nhttp://snack-web:8000/api\n(Replace snack-web with your host location.)\n\nThe images and operations can also be accessed via RESTful API. The available URI resources are listed in this table.\n\nTable: snack-web RESTful API\n\n\n  \n    \n      API\n      HTTP\n      Description\n    \n  \n  \n    \n      /snacks/\n      GET\n      return all images, that could be a lot of images\n    \n    \n      /snacks/snap\n      GET\n      take a snapshot and return the latest image. This call takes a snapshot and returns the processed image.\n    \n    \n      /snacks/id/{id}\n      GET\n      return image {id}. This call gets an image by the database id. If it is not found, null is returned.\n    \n    \n      /snacks/state/{class_state}\n      GET\n      Get all blobs matched the given class_state. This call gets a list of blobs filtered by c1ass_state.\n    \n    \n      /snacks/state\n      PUT\n      Update blobs state info by _id. This call accepts a list of (id, c1ass, c1ass_state) objects, updates their associated blobs in the database.\n    \n    \n      /snacks/class/names\n      GET\n      Get the list of class names. This call returns a list of the class names that a blob can be classified by.\n    \n    \n      /snacks/last\n      GET\n      Get the last image. 
This call returns the latest image by date_created DESC. If none exist, null is returned.\n    \n    \n      /snacks/last/{int:n}\n      GET\n      Get the last n images. This call returns a list of the latest n images by date_created DESC.\n    \n    \n      /snacks/last/summary\n      GET\n      Get the latest summary. This call returns a summary of the latest processed images including the new, duplicate and removed blobs. If no images exist, it returns null.\n    \n  \n\n\nSnapshot Naming Convention\n\nWhen a camera snapshot is taken, the image will be written into a folder named according to the snapshot’s date-time,\n\nsnack-{year}_{month}_{day}-{hour}_{minute}_{second}\n\n\ne.g. snack-2015_06_17-13_14_58 is created at date 2015-06-17 and time 13:14:58.\n\nTo see the result JSON for an image, for example, request the last image using the curl command.\n\ncurl http://snack-web:8000/api/snacks/last\n\n\nFor a list of images, for example, request all snack images using the curl command.\n\ncurl http://snack-web:8000/api/snacks/\n\n"
  },
  
  {
    "title": "Raspberry Pi 3 for Computer Vision",
    "url": "/raspberry-pi-3-for-computer-vision",
    "date": "Apr 23, 2016",
    "categories": ["post"],
    "tags": ["AI","Computer Vision","RaspberryPi","OpenCV","SimpleCV","Python"],
    "excerpt": "\nWith Raspberry Pi 3, developing a computer vision project is no longer difficult nor expensive. Computer vision is a method of image processing and recognition that is especially useful when appli...",
    "content": "\nWith Raspberry Pi 3, developing a computer vision project is no longer difficult nor expensive. Computer vision is a method of image processing and recognition that is especially useful when applied to Raspberry Pi. You could produce your IoT with computer vision components, to secure your home, to monitor beer in your fridge, to watch your kids. Once you have an initial setup, the possibilities are endless!\n\nThis article summarizes how to setup your Raspberry Pi 3, how to install the useful computer vision libraries from OpenCV and SimpleCV, how to install the machine learning framework Orange. Equipping with this software tool suites, plus\nRaspberry Pi 3 has Wifi, Bluetooth and optional OpenGL built-in,\nyour vision project will be on it’s way to reality.\n\n\n\n\nInstall Raspberry Pi 3\nYou need to Download Raspbain Jessie disk image.\n\nAfter downloaded the Raspbain disk image, on Mac, we can use Pi Filler to build the SD-card image.\n\nAfter first boot, run sudo raspi-config do the following:\n\n\n  Expand the root file system to use the full SD-card (first option)\n  Enable SSH (advance option)\n  edit /etc/hostname to give a useful host name, e.g. my-pi\n  edit /etc/hosts to point 127.0.0.1 to the hostname, e.g. my-pi\n  reboot sudo shutdown -r now\n\n\nwe need to run the installation script with root privileges as we will be writing to an SD card.\nIn some distributions you can do this by prefixing the command with sudo whereas in some\nyou will need to su root. You should consult your OS documentation for more help on this matter.\n\nBut in the case of a standard Debian install, one would run:\n\nsudo apt-get update\nsudo apt-get upgrade\n\n\nNetworking Setup\nIf you want to setup Wifi, Bluetooth, this MakeUseOf guide on How to Upgrade to a Raspberry Pi 3\nwill be invaluable resource.\n\nVNC Server\nIf you want to setup remote desktop access to the Raspberry Pi, the following is an excellent guide:\nHow to control your raspberry using mac on-board tools (VNC-Connection)\n\nInstall OpenCV and SimpleCV\nOpenCV is a C++ library of programming functions mainly aimed at real-time computer vision. SimpleCV provides a wrapper with many “batteries included” features, such as integration with the OCR Tesseract or the well known Orange machine learning framework.\n\nLearning from this Install Notes,\nit describes a super easy and fast way to setup your Raspberry Pi with OpenCV with SimpleCV module, avoiding many painful steps described by others blogs.\n\nSimply run the following commands to install the OpenCV necessary dependencies:\n\nsudo apt-get install python-setuptools\nsudo apt-get install python-pip\nsudo apt-get install ipython python-opencv python-scipy python-numpy python-pygame\n\n\nAfter all OpenCV dependencies are installed,\nwe could proceed to install SimpleCV, a wrapper API that built on top of OpenCV and\nmake computer vision really easy.\nDownload SimpleCV from github and install from the source.\n\nsudo pip install https://github.com/sightmachine/SimpleCV/zipball/master\nsudo pip install svgwrite\n\n\nAfter allowing those commands to run for a while (it is going to take a while, go grab a drink),\nSimpleCV should be all set up. 
Connect a compatible camera to the board’s USB input and open up the simplecv shell.\n\nraspberry@pi:~$ simplecv\n\nSimpleCV:1&gt; c = Camera()\n\nSimpleCV:2&gt; image = c.getImage()\n\nSimpleCV:3&gt; image.save('test.jpg')\nSimpleCV:3: 1\n\nSimpleCV:4&gt; exit\n\n\nCheck the resulting image test.jpg to see that it captured correctly.\nYou have confirmed that the Raspberry Pi is now running SimpleCV and working with your USB camera.\n\nInstall Orange - Machine Learning Tools\nOrange http://orange.biolab.si is component-based data mining software. It includes a range of data visualization, exploration, preprocessing and modelling techniques. We shall discuss more about Orange in future articles.\n\nDue to the limitations of the Raspberry Pi, we had a hard time getting the Orange framework to compile on the Pi. You may want to run the machine learning component of your project on a Linux PC (Ubuntu 12.04) for the moment.\n\nYou can clone from the Orange GitHub source\n\ngit clone https://github.com/biolab/orange.git\n\n\nTo build and install Orange you can use the setup.py in the root orange directory\n(requires GCC, Python and numpy development headers). If you followed the previous steps to install\nOpenCV and SimpleCV, you already have all the required software packages on your Raspberry Pi.\n\nTo use it, unpack the nightly sources and run:\n(warning: the compilation will take hours to complete!)\n\npython setup.py build\nsudo python setup.py install\n\n\nThis will also install the orange-canvas script so you can start Orange Canvas from the command line.\n\nTo install orange locally run:\n\npython setup.py install --user\n\n\nThis will install orange in /home//.local/lib/pythonX.Y/site-packages/orange/.\n\nComputer Vision Projects for Inspiration\n\nA Fruit Classifier\nAn inspiring image recognition project that classifies fruits.\n\nhttp://jmgomez.me/a-fruit-image-classifier-with-python-a..."
  },
  
  {
    "title": "Using Prolog to Solve Logic Puzzles",
    "url": "/using-prolog-to-solve-logic-puzzles",
    "date": "Apr 21, 2016",
    "categories": ["post"],
    "tags": ["AI","Prolog","Logic","Puzzle"],
    "excerpt": "What is a logic puzzle?\n\nLogic Puzzle is a very funny thing. We are all very interest to read and try our brains to solve\n1 or 2 of these puzzles. We thought that would improve our brain power afte...",
    "content": "What is a logic puzzle?\n\nLogic Puzzle is a very funny thing. We are all very interest to read and try our brains to solve\n1 or 2 of these puzzles. We thought that would improve our brain power after proving that we are\nlogical enough to solve these logic puzzles. Of course, we are having fun for doing that too. But\ndoes it really improve our brain function, or this simply shows our brain is incapable to handle\nlogical jumps in large recipe, like 6 houses, 5 couples and 7 kinds of tea. Obviously, we can not\nhold such a large logical space in our head, 6 x 5 x 7 = 210\npossible solutions in this case. We need to resort to use some external recording\ndevices to help us organized the “vast” information. Through the process of elimination and the\ndeductive reasoning, we would come to a possible answer.\n\n\n\n\n  [2024/10/05] We can listen to this article as a Podcast Discussion, which is generated by Google’s Experimental NotebookLM. The AI hosts are both funny and insightful.\n\n\nimages/using-prolog-to-solve-logic-puzzles/Podcast_Prolog_Logic_Puzzle.mp3\n\n\n\n\n\n\nAfter you solve a few puzzles, the natural tendency of a computer scientist is, “why don’t we\nautomated to solve these puzzles?” Of course, your computer scientist’s inclination is always right. After a few rough analysis, we\ncan see the puzzles are a search problem. If we are dumb enough to try, we can exhaustively list\nall possible answers and go back to the list of constraints to test if any one of the answer\nwill satisfy all of the constraints. If the answer produces no contradiction, we know we have the\nright answer. However, our hand and brain are usually too lazy to do this type of dumb search.\nThe puzzle book producer helps with giving you a table of all combinations so that you can pretend\nto not doing a dumb search. You are actually writing down something to eliminate some obvious\ndumb choices. The logical combination table serves as a smarter-dumb search device. This makes us\nfeel really good.\n\nHere is the popular\nGrid Method\nusing by many puzzle enthusiasts and available in magazines dedicated to the subject.\n\n\n\nPutting those checkmarks to indicate something is possible or could be eliminated, that’s involve some language reading skills. If we could not reach a solution, we are not sure if we are reading all the English sentence properly after all. We would run through the original statements to verify if there is any contradictions in our possible solution.\n\nLet’s be more serious, a computer scientist, when they are facing a logical problem, they would\nuse a programming logics language, namely PRO-LOG, to tackle with the problem. This is a good idea\nuntil when you actually start to write some Prolog codes. You may ask, what is the problem? I said,\ndid you actually try to do it. The problem is encoding a logic puzzle is more difficult than\nsolving it. Luckily, the consequence of spending infinite amount of time to code is the expectation\nto solve the later problems faster. Under the amortization principle, we are coming out with a gain.\n\nThere is one professor in the world, Mihaela Malita takes it seriously enough to supply an extremely useful Prolog Puzzle Solving library. 
I am really thankful to Prof. Malita because her library proves that automated puzzle solving is both interesting and a programming skill worth having.\n\nThe Zebra Puzzle\nThe famous Zebra Puzzle comes with 15 facts and 2 questions:\nWho has a zebra and who drinks water?\n\nThe list of facts (or constraints):\n\n\n  There are 5 colored houses in a row, each having an owner, who has an animal, a favorite cigarette, a favorite drink.\n  The Englishman lives in the red house.\n  The Spaniard has a dog.\n  They drink coffee in the green house.\n  The Ukrainian drinks tea.\n  The green house is next to the white house.\n  The Winston smoker has a serpent.\n  In the yellow house they smoke Kool.\n  In the middle house they drink milk.\n  The Norwegian lives in the first house from the left.\n  The Chesterfield smoker lives near the man with the fox.\n  In the house near the house with the horse they smoke Kool.\n  The Lucky Strike smoker drinks juice.\n  The Japanese smokes Kent.\n  The Norwegian lives near the blue house.\n\n\nWe represent the houses as a list with 5 lists from left to right:\n\nSol = [[Man, Animal, Cigarette, Drink, Color], [..],[..],[..],[..] ]\n\nWe are using SWI Prolog\nto implement the zebra puzzle. Relying on the puzzle library bibmm.pl (download from the Prolog Puzzle Solving library), we translate the constraints into Prolog code as follows,\n\n:-consult('bibmm.pl').\n\nstart(Sol):- length(Sol,5),                 % 1\n    member([english,_,_,_,red],Sol),        % 2\n    member([spanish,dog,_,_,_],Sol),        % 3\n    member([_,_,_,coffee,green],Sol),       % 4\n    member([ukrainian,_,_,tea,_],Sol),      % 5\n    right([_,_,_,_,green],[_,_,_,_,white], Sol),    % 6\n    member([_,snake,winston,_,_],Sol),      % 7\n    member([_,_,kool,_,yellow],Sol),        % 8\n    Sol= [_,_,[_,_,_,milk,_],_,_],          % 9\n        Sol=..."
  },
  
  {
    "title": "Using Pharo to Learn Smalltalk",
    "url": "/using-pharo-to-learn-smalltalk",
    "date": "Apr 18, 2016",
    "categories": ["post"],
    "tags": ["Smalltalk","Pharo","RaspberryPi"],
    "excerpt": "\nPharo is an open source implementation of the programming language and environment Smalltalk. Pharo emerged as a fork of Squeak, an open source Smalltalk environment created by the Smalltalk-80 te...",
    "content": "\nPharo is an open source implementation of the programming language and environment Smalltalk. Pharo emerged as a fork of Squeak, an open source Smalltalk environment created by the Smalltalk-80 team. This article explores how to use Pharo to learn Smalltalk, using Pharo unique package management tools and running Pharo on Raspberry Pi.\n\n\nInria is the company who produce Pharo and many related software anaysis tool. You can find the list here:\n\n\n  Inria Softwares List\n\n\nPharo Video Tutorial\nHere are some recommended Pharo learning tutorials:\n\n\n  Learning Pharo\n  Learning more Pharo\n\n\nPharo Books\nPharo team produced an excellent free book, that you can download from here:\n\n\n  Deep into Pharo Book\n\n\nPresentation\n\n  \n    There is a nice introduction to Smalltalk and Pharo environment by Marcus Denker\nPharo Objects at Your Fingertip and\nPresentation Slides\n  \n  Tudor Girba - Pharo: Playing with Live Object\n(where AtomMorph demo are showing)\n    \n      Playing with Live Object\n    \n  \n  Laurent Laffont - Manipulating Objects\n(where AtomMorph demo are showing; this video is technical details)\n    \n      Manipulating Live Objects\n    \n  \n\n\nPharo is the cool new kid on the object-oriented languages arena. It is Smalltalk-inspired. It is dynamic. It comes with a live programming environment in which objects are at the center. And, it is tiny. But, most of all, it makes serious programming fun by challenging almost everything that got to be popular. For example, imagine an environment in which you can extend Object, modify the compiler, customize object the inspector, or even build your own the domain-specific debugger. And, you do not even have to stop the system while doing that.\n\nMonticello\nTons of packages can be found at:\n\n\n  Smalltalk Hub for Shared Packages\n\n\nAfter finding what you like, you can use Monticello Browser to add the package.\n\n\n\nThe repository can be added, for example\n\nMCHttpRepository\n\tlocation: 'http://smalltalkhub.com/mc/PharoExtras/MorphExamplesAndDemos/main'\n\tuser: ''\n\tpassword: ''\n\nDemo with BouncingAtomsMorph\n\nThe tutorial steps on using AtomMorph Demo are repeated here. For a newcomer on learning Smalltalk,\nthis AtomMorph Demo illustrates many amazing features of the Smalltalk dynamic nature and it’s\ndevelopment environment.\n\n\n\n\n  BouncingAtomsMorph new openInWorld\n  meta-click&gt;open explorer\n  select some submorphs&gt;AtomMorph and inspect it\n  self color: Color red\n  self velocity: 0@0\n  drag out the red AtomMorph\n  make it larger\n  copy it\n  inspect it\n  self velocity: 2@3\n  drag it back to the BouncingAtomsMorph\n  click on the red Menu handle and embed it\n  self browse the AtomMorph\n  browse velocity:\n  browse senders of velocity: -&gt; browse #bounceIn:\n  modify to beep after bounce:\n\n\nbounced ifTrue: [self velocity: vx @ vy. 
Beeper beep ].\n\n\n  very noisy, so add test for color red\n\n\nbounced ifTrue: [self velocity: vx @ vy.\n\tself color = Color red ifTrue: [Beeper beep] ].\n\nAlternative:\n\n\n  create a subclass of AtomMorph with a different color that beeps\n  define BeepingAtomMorph\n\n\nbounceIn: aRect\n\t| bounced |\n\tbounced := super bounceIn: aRect.\n\tbounced ifTrue: [ Beeper beep ].\n\t^ bounced\n\ndefaultColor\n\t^ Color red\n\n\n  BeepingAtomMorph new openInWorld\n  instantiate it and embed it\n  find senders of bounceIn:\n  see BouncingAtomMorph»step tests AtomMorph class\n  we can change BouncingAtomMorph»step or BeepingAtomMorph as follows:\n\n\nisMemberOf: aClass\n\t^AtomMorph == aClass\n\nShow all Morph Instances\n\nBouncingAtomsMorph allInstances.\n\nRemove all Morph\nInspect BoundingAtomsMorph object and execute,\n\nself removeAllMorphs\n\nMetacello\nMetacello is a package management system for Monticello (a versioning system used in Smalltalk). There is a chapter about Metacello in the “Deep into Pharo” book, and it gives a good in-depth knowledge about this system. On the other hand when I was starting to use Metacello, I needed something more simple and direct, like what I described here.\n\nIn Pharo, Metacello is presented as Configuration Browser, which you can use to easily install more packages.\n\n\n\nInstall Packages\n\n\n  \n    \n      Package\n      Description\n    \n  \n  \n    \n      Roassal2\n      Roassal graphically renders objects using short and expressive Smalltalk expressions. A large set of interaction are offered for a better user experience. Painting, brushing, interconnecting, zooming, drag and dropping will just make you more intimate with any arbitrary object model. Documentation is here Roassal2 Documentation\n    \n    \n      NeoJSON\n      JSON (JavaScript Object Notation) is a popular data-interchange format. A number of implementations of this simple format already exist. NeoJSON is a more flexible and more efficient reader and writer for this format. Documentation is here NeoJSON Paper\n    \n    \n      NeoCSV\n      CSV (Comma Separated Values) and more generally other delimiter-separated-value formats like TSV are probably the most common data exchange format. A number of implementations of th..."
  }
  
]
