3D Capture – Semester Wrap Up

Our project on 3D motion capture was certainly a learning experience.  Although we have yet to model phenomena over time, we have substantially improved our skills in data collection (photographic technique), post-processing, and model construction and refinement in Agisoft Photoscan.

Data Collection

Initially, we took photos of objects indoors lit by a halogen light, using a 30-110 mm lens. This setup caused a few issues:

  • The directional light cast shadows across the object, which made it difficult for Photoscan to align the camera angles.  In addition, the hue of the halogen light altered the final generated textures for the object (everything was tinted yellow).
  • We were attempting to model white-ish objects on a white counter top, which also made it difficult to align camera angles.  With little apparent surface texture and such similar colors, it was harder to distinguish the object from its environment.
  • With the 30-110mm lens, we zoomed in so the object filled the entire frame, but given our camera placement this created a very narrow depth of field. As a result, if we focused on one part of the object, other parts appeared blurry, which interfered with both dense point cloud generation and texture generation. (In the Depth of Field and Color Issues link, you can see how the sword and arm are in focus, but more distant features like the head are out of focus.)
  • The importance of proper coverage is still something we’re working on. We’ve often forgotten to take high shots of the top of an object, only to be left with an excellent-looking model that has a hole in the top of the mesh, which can be quite frustrating. The same is true of gaps in the model, such as bent arms or the areas between legs; coverage of these areas is essential to prevent holes or conjoined parts of the model.

Depth of Field and Color Issues

Essentially, the few images taken in this fashion produced models that we’ve dubbed “pudding monsters” due to their lack of well-defined edges and features.

No Texture Pudding

Eventually, we found that shooting outdoors in overcast conditions provides excellent lighting: the light is uniform all around the object, which prevents harsh shadows. This, in conjunction with more apparent surface detail, allowed for much better alignment and mesh generation. We also found that taking close-up shots after getting general coverage really improved the quality of the mesh and texture; it seems that one can mix different kinds of shooting (e.g. panning and circling) as long as there is sufficient overlap between images to obtain alignment.

On the note of alignment, we really should be using markers. They speed up the alignment process immensely and also help where there is ambiguous overlap. In some cases we had to experiment with manually inserting markers into the datasets to get cameras to align, and while we were able to obtain full camera alignment this way, it was very time consuming and not viable for regular use. Some objects are very difficult to align; we’ve had issues getting bushes and large trees to align properly, sometimes aligning only 4-6 cameras out of a 200+ image set. Again, printing out markers could help immensely with this.

Power_pole

Post-Processing

Our final improvements this semester were in editing photos before importing them into Photoscan.  Initially we did no photo manipulation, but later we found that we could preprocess the photos to correct the contrast, lighting, and exposure of the original images. We used this approach on the Mother Nature model from Epic (not shown here), which was shot indoors in directional light. The corrections may have helped with alignment accuracy and depth calculation, but the resulting textures were non-uniform because the corrections were stronger at some angles than others. Overall the model turned out well, but there is certainly room for improvement. We are currently using the camera’s automatic mode for very quick modeling, but we wonder if it would be better to use manual exposure settings and keep the aperture, ISO, and shutter speed constant between photos. That constancy should give very even textures and consistent exposure across photos, though it would only be viable for diffuse, well-lit subjects.

Model Construction and Refinement

The biggest mistake we made in the earlier stages of the project was constructing meshes from sparse point clouds.  Essentially, we were building models from only the limited data points produced when the camera angles are aligned.  With dense point cloud generation, Agisoft uses photogrammetry (which Andrew and Bryce should learn about next semester in CS 534 – Digital Photo Computation) to construct data points for key features and interpolate values between them, creating a much more detailed data set.  As a result, our models turned out much better, even with the dense point cloud generated at a low-detail setting.

For example, we were able to achieve good results with our Gundam model after manual camera alignment (due to our lack of tracking markers) and ultra-high dense point cloud rendering. However, this was very time consuming, and the final model lacked the fine surface detail of the source; the surfaces tended to be very pitted, with few smooth planes. We think this was due to our use of a very shallow depth of field. Our theory is that Photoscan found corresponding points in two images where the feature was in focus in one and out of focus in the other; compounded over many angles, this would likely produce a model that mixes the two and thus introduces errors into the depth calculations. In the future we would like to repeat this with a very wide depth of field, keeping all parts of the model in focus, and we expect superior results.

Model from High-Quality Dense Point Cloud (~1,100,000 Vertices)

Model from Low-Quality Dense Point Cloud (~30,000 Vertices)

In addition, we refined the point clouds by manually trimming outlying data points, which created a much more accurate mesh.  This, accompanied with all prior techniques, allowed us to create a very nice model quite easily:

Right-Side Angle
Back Angle (Holes in the mesh are from lack of photo coverage)
Left-Side Angle
Front Angle

Here the models were exported from Photoscan as .obj files, converted into JavaScript files, and viewed with WebGL.  This simply provided an alternative way to view the models outside of Photoscan. (We are unsure how to embed HTML pages with external JavaScript files into WordPress, which would let the models be viewed from all angles instead of as static images.)
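If you just want a quick local preview of an exported .obj without setting up a WebGL page, a minimal Processing sketch along these lines also works (a rough sketch, shown in Processing simply because it has a built-in OBJ loader; the filename and scale factor are placeholders you would adjust per model):

PShape model;

void setup() {
  size(800, 600, P3D);
  // Assumes model.obj (plus any material/texture files) sits in the sketch's data folder.
  model = loadShape("model.obj");
}

void draw() {
  background(30);
  lights();
  translate(width/2, height/2, 0);
  // Orbit the model with the mouse.
  rotateY(map(mouseX, 0, width, -PI, PI));
  rotateX(map(mouseY, 0, height, -PI, PI));
  scale(100);  // placeholder; adjust to the model's units
  shape(model);
}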

In Conclusion

We’ve had our share of issues and, for the most part, have worked through them. We feel confident in our ability to get decent models, and are looking forward to more experimentation with depth of field, markers, and using multiple chunks to obtain high quality on small, detailed areas of models, which we will investigate this summer and possibly next fall. We also plan to model more challenging subjects, such as very small or very large structures, and possibly begin looking into modeling dynamic events (starting simple, of course).

We also have ideas of finding “uses” for the models, such as importing them into a game engine or 3D printing them, but we’ll have to see where it goes!

-Andrew and Bryce

 

Week 1 Modular Composition

I am back. It’s been a few months since I last coded in Processing and this represents a new opportunity for me to delve right back into particle systems.

This semester, however, I would like to take it to the next level by making Processing compositions that are “architectural” in nature. In any case, I needed to review the major concepts of object-oriented programming.  I spent the majority of the week watching Daniel Shiffman’s recently launched YouTube channel, where he provides supplementary lectures to the two books I used in the fall to learn Processing.

So I wanted my first script to be somewhat angular and modular. Modularity will be a big theme this summer as I set out to model expansive urban sceneries. The best way to accomplish this is through modular set pieces that are arranged, repeated, and combined to create buildings; this technique is used to create the diverse open worlds of popular video games like Fallout and Grand Theft Auto.
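To make the set-piece idea concrete before getting to this week’s sketch, here is a toy example (not this week’s script, just an illustration with made-up dimensions): one “window module” drawn by a single function and then repeated on a grid to form a simple facade.

// Toy illustration of modular composition: one set piece, repeated.
void setup() {
  size(400, 400);
  noLoop();
}

void draw() {
  background(220);
  // Arrange the same module on a regular grid.
  for (int y = 40; y <= height - 80; y += 60) {
    for (int x = 40; x <= width - 80; x += 60) {
      windowModule(x, y);
    }
  }
}

// A single set piece; swapping this function out restyles the whole facade.
void windowModule(float x, float y) {
  fill(255);
  stroke(0);
  rect(x, y, 40, 40);
  line(x + 20, y, x + 20, y + 40);  // vertical mullion
  line(x, y + 20, x + 40, y + 20);  // horizontal mullion
}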

As a starting point, I thought of the random walker example at the beginning of Shiffman’s The Nature of Code. We can choose a variable at random and have it determine whether the walker steps up, down, left, or right. In order to constrain the direction, I made this small function:

void changeDir() {
  float r = random(0, 1);   // coin flip
  float newAngle = angle;
  if (r > 0.5) {
    newAngle += 90;         // turn 90 degrees one way...
  } else {
    newAngle -= 90;         // ...or the other
  }

Depending on the value of ‘r’, we set the velocity direction of each new particle from the sine and cosine of the updated angle:

  velocity = new PVector(sin(radians(newAngle)), cos(radians(newAngle)));
  angle = newAngle;
}

This confines each particle’s trace to straight segments along one of the four cardinal directions, so you end up with particles tracing a network of squares. Pretty neat!

You can see my script here http://www.openprocessing.org/sketch/361605
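For anyone who doesn’t want to click through, here is a stripped-down, self-contained version of the idea (my simplification, not the exact code in the linked sketch; the step size and re-direction probability are arbitrary choices):

Walker w;

void setup() {
  size(600, 600);
  background(255);
  w = new Walker();
}

void draw() {
  w.update();
  w.display();
}

class Walker {
  PVector position;
  PVector velocity;
  float angle = 0;

  Walker() {
    position = new PVector(width/2, height/2);
    changeDir();
  }

  void changeDir() {
    float r = random(0, 1);
    float newAngle = angle;
    if (r > 0.5) {
      newAngle += 90;
    } else {
      newAngle -= 90;
    }
    velocity = new PVector(sin(radians(newAngle)), cos(radians(newAngle)));
    angle = newAngle;
  }

  void update() {
    position.add(velocity);
    // Occasionally pick a new cardinal direction, producing right-angle turns.
    if (random(1) < 0.02) {
      changeDir();
    }
  }

  void display() {
    stroke(0);
    point(position.x, position.y);  // background() is never redrawn, so the trace persists
  }
}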

See you in another post this week.

 

Sparse Photo Angles – Bryce and Andrew Update

(For some reason embedding images is not working, but images are visible on clicking)

Texture

4 positions, 3 elevations, good alignment

No Texture Pudding

4 positions, 3 elevations, good alignment, no texture

We wanted to look at how the sparseness of the photo angles affects the resulting model. The goal was to simulate the possible camera configurations we would have for a high-speed setup. Given that a 360° model would require many more cameras than we have available, we decided to target only one side of the model to maximize overlap. We found that the best results were obtained with close camera angles and multiple elevations; however, even these were not sufficient to preserve much detail in the final model.  This is most evident in the untextured base model, since once textures are applied the models tend to look deceptively better. Dense point clouds are also very computationally intensive to generate; for a 3D model with 9,060 faces, the dense point cloud is over 10,000,000 points. We also found that separation from the background was difficult to maintain with so few angles.

2 positions, 2 elevations, texture applied

2 positions, 2 elevations, no texture

With four images from the same elevation, we were only able to get 56 points of correlation and no surface formation at all.

Empty Surface, 4 images same elevation

These experiments call into question the feasibility of high-speed capture with any significant detail. Even with high-resolution (14 MP), well-exposed source images, the results preserved only sub-par detail in the model. Given the much lower resolution and difficult exposure of high-speed capture, we suspect that models produced in that manner would retain very little discernible detail, if any.

We would like to meet up with Kevin at some point to discuss possible different project goals.

 

chat Update 3

I have now decided to use Photon’s suite of services to drive both audio and text chat. Photon provides both the server backend and Unity plugins for implementing voice and text chat. The service is free for under 20 concurrent users, so there’s no problem there.

I think I will try to stick with the card UI language, but I may experiment with a 3D manifestation like the walkie-talkie I sketched up. It would be cool to have a physical (virtual) object that adds a 3D sound layer when you bring it up to your ear, more of a binaural sound than just hearing it from a source in front of you, much like the phone in Job Simulator. I imagine the indicator light on the top would light up when you have a new message, and putting the device up to your ear would play the message. The trigger on the back would let the player record a message or talk back directly. The Vive and Oculus CV1 both have microphones near the face, so it would be quite easy to record the player’s voice.

Concept for a walkie-talkie


I am currently in the process of setting up Photon and modeling the walkie-talkie above to act as the point of interaction for voice. I am also trying to work out a way to adapt the custom emoji support in Photon to act as a quick way to send large emoji to another person.

chat Update 2

First, I’d like to apologize for not posting more frequently. I’ve run into some architectural issues that I’m trying to figure out before I move forward and start coding.

When trying to create a solution that has to interact with multiple different frameworks and platforms, things can start to get complicated. Let’s break things down and look at the options that we have for each.

Backend Framework

Angular.js

Node.js

Game Engine

Unity Engine

Unreal Engine

Input Solution

Text-to-speech

Voice Messages

Sending emoji (pictures)

 

Thankfully, we can take one factor out of the equation right from the start: I will be implementing everything in the Unity game engine. I will also be targeting the HTC Vive because of its great resolution and natural controllers.

Originally, I had wanted to test a variety of chatting methods in a lightweight way in a virtual environment. For the sake of time and scope over the semester, I will focus on sending emoji and other images, mostly because emoji are an emerging form of communication and I think their effects in a VR setting might prove to be quite interesting.

When I was researching what is possible with emoji and a lightweight, over-the-internet P2P solution, I ran into a couple of issues. Currently, Apple’s emoji set is the most ubiquitous and up to date of the current emoji typefaces. Emoji standards and definitions are set by Unicode, and companies and organizations create fonts based on those standards. If I were to continue using Apple Color Emoji, my best option would be to use exported PNG versions instead, as otherwise the emoji would look different on each computer. I could also use EmojiOne, an open-source emoji font. It will come down to whether sending text over a chat library is easier than sending images. I could also simply send a code between the users and map that code to a corresponding PNG when it reaches the second user. All are viable options at this point.
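As a quick illustration of that last option (send a short code, map it to a local image on the receiving side), here is a rough sketch; it is written in Processing only to keep the example self-contained, the real version would live in Unity, and the codes and filenames are made up:

import java.util.HashMap;

HashMap<String, String> emojiFiles = new HashMap<String, String>();
PImage received;

void setup() {
  size(200, 200);
  // Each client keeps its own code-to-image table, so only the short code travels over the network.
  emojiFiles.put(":smile:", "smile.png");
  emojiFiles.put(":heart:", "heart.png");

  String incoming = ":smile:";  // pretend this code just arrived from the other user
  if (emojiFiles.containsKey(incoming)) {
    received = loadImage(emojiFiles.get(incoming));  // assumes the PNGs exist in the data folder
  }
}

void draw() {
  background(255);
  if (received != null) {
    image(received, 0, 0, width, height);
  }
}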

This next week I will be building a web-based version to try to test these different methods.

Until next time,

Tyler

High Speed Modeling – Update 2

-Andrew Chase and Bryce Sprecher

Progress

In the last week, Bryce and I tried constructing 3D models with Agisoft Photoscan to familiarize ourselves with the software.  All in all, we were successful, and we found out what works well and what doesn’t.  Here are our results:

(Click photos for a larger view)

Paint_Tree

Outdoor lighting, or any environment with a lot of ambient light, works best.  The fewer shadows and reflections on the modeling subject, the better.

Power_box

Things can still go wrong, though, such as in this case where we tried to model a simple power box. This attempt was rushed, to see how rigorous we needed to be in our photo collection, and it still turned out alright. One simple fix would be to mask each image before aligning the photos, so the background would not interfere with the model.

Texturing seems to work well in Photoscan, as seen with this power pole.

Power_pole

We’ve tried a few indoor models and found that we need a better lighting setup (light boxes, as opposed to generic LED bulbs or halogen lights, to increase diffusion).  We also need to find a way to prop up our samples to allow below-horizon shots of the object, which we didn’t do for the controller example. We tried a figurine model as well, but had issues with it.  It was a small object, which leads us to believe that smaller objects require a more sophisticated setup.  Bryce is very familiar with the camera (and knows a lot about optimal photography) at this point, and can adjust ISO sensitivity, shutter speed, and various other settings to optimize our data collection, so we are unsure why smaller objects are still difficult to model.

controller

Next Steps

Our next goal is to investigate better lighting techniques and see how we can construct a model with limited camera angles.  If we are to model an object over time with only four cameras, for example, we need to see how the model turns out with such a limited perspective.  If anything, we may need to create a “one-sided model” that is 3D from one side but has no mesh on the other.

In addition, we plan to look into how to synchronize the cameras for simultaneous capture.  There is software (and an app) for this camera, and we plan to see what its limitations are.  If the software cannot handle the task, we will look into using post-processing to align and extract frames from each camera’s perspective and create models from those.

High Speed 3D Modeling – First Steps

Hello Bloggers!

This is Andrew Chase and Bryce Sprecher, and we’ve started working in the LEL this semester under the direction of Kevin.  We haven’t made any posts yet because most of our effort has been focused on deciding exactly what we want to do and how we are going to approach it.  Now that we’ve had a chance to start, here’s our plan:

High-Speed 3D Modeling Through Agisoft

Agisoft Photoscan is a software program that reconstructs 3D models from digital images.  We plan to start by creating simple models of objects, then progressively move on to taking high-speed videos of some phenomenon (fluid dynamics, popping popcorn, combustion of materials, etc.) and constructing models of the process.  From there, we can play back the event in 3D (first by conventional means on a 2D display, and hopefully with Oculus/CAVE integration) in slow motion, to relive the event up close and in detail.

We obtained a Nikon 1 J4 digital camera with two different lenses (10-30mm and 30-110mm), which is capable of capturing:

  • 1080p at 60 frames per second (fps)
  • 720p at 120 fps
  • 768×288 at 400 fps
  • 416×144 at 1200 fps

We’ve tested several different phenomena at different resolutions, and found that 1200 fps capture works well for high-speed imaging of solid objects when texture detail is not critical.  If detail is needed, the 400 fps mode still captures high-speed action with very acceptable detail.  Here are some examples:

Dice Roll @ 400 fps

Egg Crack @ 400 fps

(NOTE: I’ve tried for a while to embed the videos, but it doesn’t seem to work on this blog)

With some post-processing, we believe we can further enhance the high-speed effect (reducing playback speed further, adding motion blur or interpolated frames, etc.).  In addition, these were shot in low-light environments, and it is apparent that a lot of light will be needed for this kind of capture: at high frame rates the shutter can only stay open for a tiny fraction of a second per frame (at most about 1/1200 s, roughly 0.8 ms, at 1200 fps), so far fewer photons reach the sensor.

Our next steps involve making some models and seeing how Agisoft Photoscan handles video input (whether it can ingest a group of frames as a video, or whether we will need to extract frames manually). If everything looks promising, we can look into getting more cameras to capture different perspectives of the phenomenon and create a model.

chat – A lightweight tool for interacting with others in virtual reality.

Chat gives users in a virtual reality (VR) homespace a quick, easy, and fun way to send voice messages, stickers, and even presents to other VR users.

Rather than copying traditional messaging interfaces into a VR homespace, I will explore and execute on methods that emphasize light, natural forms of communication.

The following examples show mockups for four different types of communication that could work in VR:

  • Voice – Allows a user to capture a voice message and quickly send it to another user.
  • Text – Allows a user to either choose from predefined responses or use voice-to-text.
  • Emoji – Allows a user to select from pre-defined emojis or stickers in a quick way.
  • Present – Allows a user to send a virtual gift to instantiate in the other user’s scene.

chat Concepts-01

Mockup showing voice, including a microphone button that will allow them to record their voice.

 

chat Concepts-02

Mockup showing text, including a predefined message.

 

chat Concepts-03

Mockup showing emoji, allowing users to pick from a grid of reactions.

 

chat Concepts-04

Mockup showing present, including a 3D model of a dog to be sent to another user’s space.


 

Talking Points

  • Traditional 2D interactions versus rich 3D experiences
  • Implement a system of basic chat and simple 2D games across players
  • Explore different methodologies
  • Scaling from 2 people in a chat to a whole group of people and what that entails
  • How to share information and resources
  • How can people collaborate differently in VR compared to what we have today?
  • Potentially measure the effectiveness of simple chat experiences vs. full 3D experiences
  • Does a “2D” social experience feel natural in VR?

Existing VR social spaces: AltspaceVR, VRChat, Convrge, vTime, Oculus Social (Alpha)

Deliverable: A collaborative social application (in Unity) (“2D”) that can run with two or more players and a short study on how people respond to the application and what steps can be taken to improve the application.

Progress report 4

Over the last couple of weeks I have worked on developing research ideas and planning the overall study procedure. This became more concrete after meeting with Prof. Ponto on Monday.

Given the time constraints, we decided to keep the overall procedure relatively simple for the STAR project.

This progress report is organized as follows. First, I discuss the behavioral data that can be measured and analyzed for the STAR project. Although the original research plan was simply to compare shoppers’ in-store behaviors in the 3D virtual store and the actual store, I would like to develop the study further. At Monday’s meeting, we discussed other aspects that could be examined in order to get a bigger picture of store design research; these are what I cover in the second half of this report.

A popularly accepted industry adage is that “unseen is unsold.” Consistent with this belief, retailers have developed strategies to draw shoppers’ attention and expose them to more products. Suppose a shopper enters the university store to buy notebooks. Passing the electronics aisle, she remembers that her computer mouse broke last week and she needs a new one. In-store stimuli such as products or promotional banners can guide a shopper’s overall path through the store by triggering further wants and needs. Along these lines, for the STAR project I would like to collect behavioral data that show how shoppers use physical products in the store as external memory cues that create new needs or trigger forgotten ones.

The specific plans for data collection are as follows.

In both conditions, participants will be asked to check off the three product categories they plan to purchase during the current shopping trip. The main purpose of limiting the number of products in both conditions is to make it easier to compare deviations from the planned shopping path. However, this approach may invite criticism, so it will be necessary to think through valid reasons that justify the manipulation.

Behavior            | Actual store                           | 3D virtual store
Attention (noting)  | Fixation of head                       | Fixation of head or mouse
Approach            | Move near to the product (distance)    | Move near to the product (distance)
Examination         | Pick up the item and check the price   | Instructed to click/point at the item they want to examine

At Monday’s meeting we also concluded that calculating reference and actual shopping paths can be rather complicated. In that case, we will instead compare the number of items that shoppers pay attention to, approach, and examine.

To develop this research further, I would like to examine other variables that can affect in-store behavior.

1) Store familiarity in terms of product location and layout

2) Shopping goal

The situational variable of shopping goal affects in-store decision making in various ways. In line with mindset and construal level theory, it is generally known that individuals in an abstract state are more susceptible to environmental cues, as undecided individuals do not yet know what they want to buy. Research consistently reports that shoppers use physical products in the store as external memory cues that create new needs or trigger forgotten needs (Hui, Inman, Huang, and Suher, 2013; Inman and Winer, 1998; Park, Iyer, and Smith, 1989). My central premise is that shoppers who are less certain of their goal will shop more, because they are more open to in-store stimuli and are guided by external memory that triggers wants or needs. When a consumer shops with a concrete goal in mind, product search should be guided by internal memory; in contrast, when consumers are not fully aware of what they want, search activities should be guided by external memory. This in turn increases the shopper’s susceptibility to attractive marketing promotions.

With respect to shopping goal, I expect the following hypotheses can be tested in both the actual and the 3D virtual store.

Hypothesis 1: Compared to consumers with concrete shopping goals, those with abstract shopping goals browse and search for products in a less selective manner.

Hypothesis 1a: Compared to consumers with concrete shopping goals, those with abstract shopping goals pay attention to a more diverse set of product categories.

Hypothesis 1b: The probability that visual attention (noting) leads to subsequent search behavior (approaching and examining) is higher for consumers with an abstract overall shopping goal than for consumers with concrete shopping goals.

 

*) We also discussed the possibility of studying online shopping behavior. Surprisingly, to my knowledge there is no research that explicitly compares search behavior in offline and online shopping contexts. I will discuss this more in the next post.

*) I received the IRB review and comments. Since they say I need a letter of permission to conduct the study at the bookstore, I got a signature from Duane and secured that permission today. I will be working on revising the IRB application over the next couple of weeks.

Week of 12-18-2015

My independent study with Professor Kevin Ponto this semester has come to its final stage. I would like to conclude the study with the following discussion:

My proposed study theme was “Manipulation of contemporary medical imageries,” and the method of investigation was art practice. This kind of art practice project was not only a personal expression, but was also rooted in studying how the human condition is viewed through a multi-disciplinary lens spanning art, design, and medical science. My art practice comprised two parts:

  1. Create a physical artwork that addresses medical image mapping.
  2. Extend the theme with a video projection using basic mapping techniques to display MRI images.

In early November 2015, I completed both parts. These two artworks became part of my MA show in mid-November, along with a performance piece performed by my music collaborator, Brian Grimm, and myself.

Overall, this project was extremely fruitful. Though there were many obstacles along the way, I am very pleased with the experience and the outcome.

Here I would like to focus on the architectural glass window installation, because it best represents my project goals. This piece took me the longest time and the greatest effort, yet it is also the most rewarding. I spent two months conducting extensive research on:

  • Gothic stained glass (color scheme, method of narrative, pattern design, etc.)
  • The architectural environment of the Humanities Building, UW-Madison (the Brutalist architectural design, the geometrical features of the 13 glass panels on the north wing, how the visual character changes with the light from day to night and from interior to exterior, inhabitant behaviors, etc.)
  • Design of the window installation with medical images to tell my personal story (my own MRIs, chemical structure of human cell and drugs, medical symbols, etc.)
  • Use of materials (the choice of see-through vinyl sheets, degree of transparency, color intensity when light passes through, durability, UV protection, material cost, shipping cost, shipping timeframe, contingency plan, etc.)

I enjoyed the research process a lot, and my knowledge of medical images and media design was greatly enriched. The installation occupies the 6th and 7th floors and is 18.5 feet high and 15.5 feet wide, with 13 window panels in different shapes and dimensions. It was challenging both artistically and in the physical installation process, but the outcome was very satisfying and the artwork was well received by different parties. Besides the learning and the satisfaction of art-making, I also received practical rewards from this project: the installation will remain in the Humanities Building, and in mid-December I was awarded the 2016 David and Edith Sinaiko Frank Graduate Fellowship for a Woman in the Arts, which will allow me to continue my art project on the same theme.

In this sense, the installation met about 90% of what I originally planned, and I reached my goals with extra rewards. The remaining 10% was mainly about time management and budget constraints. The installation required more manpower than one person; ideally I would have had at least three people working together for one to two days. In reality, I completed the installation with one classmate in two full days, which is quite a miracle. In addition, the cost of materials was very high when I got quotes locally, so my final choice was to order the vinyl sheets from a China-based printing company and have them shipped to me, accepting a risk in quality control. Fortunately, the prints came out nicely. These hurdles did not cause major setbacks in the overall project.

As mentioned, I will move this project forward with the funding from the award. In my application, I proposed that “this project will serve as a complete series of my architectural installation as well as part of my (pending) PhD preliminary requirement.” I am now moving toward the area of media technology, especially applications of virtual reality.

My next step is to finalize my PhD program and really get started! I plan to move into the research and practice process that points toward my PhD study direction.

In conclusion, I had an excellent experience with this project. Professor Ponto supervised me on a regular basis with clear guidance, and he gave me enough trust and freedom that I was able to excel academically and artistically. I am looking forward to proceeding to the next phase under Professor Ponto’s advice.

 

Additional writing about contemporary artists who employ their own images as a “medical body.” This writing is excerpted from my MA degree research:

“Autopathography of the Artist’s Own Medical Body”

(uploaded article from academic.edu)